Estimates of missing heritability for complex traits in Brown Swiss cattle

Román-Ponce, Sergio-Iván; Samoré, Antonia B; Dolezal, Marlies A; Bagnato, Alessandro; Meuwissen, Theo HE

doi:10.1186/1297-9686-46-36

Research
Open access
Published: 04 June 2014

Sergio-Iván Román-Ponce¹,²,³, Antonia B Samoré¹, Marlies A Dolezal¹, Alessandro Bagnato¹ & Theo HE Meuwissen²

Genetics Selection Evolution volume 46, Article number: 36 (2014)

Abstract

Background

Genomic selection estimates genetic merit based on dense SNP (single nucleotide polymorphism) genotypes and phenotypes. This requires that SNPs explain a large fraction of the genetic variance. The objectives of this work were: (1) to estimate the fraction of genetic variance explained by dense genome-wide markers using 54 K SNP chip genotyping, and (2) to evaluate the effect of alternative marker-based relationship matrices and corrections for the base population on the fraction of the genetic variance explained by markers.

Methods

Two alternative marker-based relationship matrices were estimated using 35 706 SNPs on 1086 dairy bulls. Both pedigree- and marker-based relationship matrices were fitted simultaneously or separately in an animal model to estimate the fraction of variance not explained by the markers, i.e. the fraction explained by the pedigree. The phenotypes considered in the analysis were the deregressed estimated breeding values (dEBV) for milk, fat and protein yield and for somatic cell score (SCS).

Results

When dEBV were not sufficiently accurate (minimum reliability of 50 or 70%), the estimated fraction of the genetic variance explained by the markers was around 65% for yield traits and 45% for SCS. Scaling marker genotypes with locus-specific frequencies of heterozygotes slightly increased the variance explained by markers, compared with scaling with the average frequency of heterozygotes across loci. When the two relationship matrices were fitted separately, the estimated fraction of the genetic variance explained by the markers followed the same trends but the results were underestimated. With less accurate dEBV estimates, the fraction of the genetic variance explained by markers was underestimated, which is probably an artifact due to the dEBV being estimated by a pedigree-based animal model.

Conclusions

When using only highly accurate dEBV, the proportion of the genetic variance explained by the Illumina 54 K SNP chip was approximately 80% for Brown Swiss cattle. These results depend on the SNP chip used and the family structure of the population, i.e. denser SNP panels and closer family relationships are expected to result in a higher fraction of the variance explained by the SNPs.

Background

Genome-wide dense marker arrays that are available for livestock populations cover all chromosomes with dense single nucleotide polymorphism (SNP) markers [1]. Many dairy cattle populations are currently being genotyped using these arrays [2–4]. The main objective is to apply genomic selection (GS) [5]. GS allows prediction of the genetic merit of young animals based on marker information in the absence of own performance data. The marker effects are estimated in a reference population, which must have both genotypic and phenotypic records. In the case of dairy bulls, phenotypic data come from genetic evaluations in the form of daughter yield deviation (DYD) or deregressed estimated breeding values (dEBV) [6].

Identity by descent (IBD) alleles refer to alleles that descend from a common ancestor in the base population [7]. The coefficient of coancestry between two animals is defined as the probability that two randomly sampled alleles from the two animals are IBD [8], and twice the coancestry is defined as their numerator relationship [8]. This approach leads to the estimation of a matrix of relationships based on the pedigree information. The latter is fundamental to estimate the genetic parameters for complex traits such as heritability (defined as the proportion of the phenotypic variance in a population that is attributed to additive genetic effects). The relationship matrix based on pedigree data dates back to a base population, for which parents are unknown and which is considered unrelated, unselected and non-inbred. The choice of the base population affects the estimate of the additive genetic variance [9].

However, the relationship matrix can also be estimated from genome-wide genetic markers such as panels of SNPs [10–12]. Methods have been developed to construct such marker-based relationship matrices [12–15]. Recently, these relationship matrices have been used to dissect the additive genetic variance of complex traits [16].

The proportion of the genetic variance not captured by markers (C_miss) represents the variance that cannot be used by GS and affects the maximum accuracy that can be achieved by GS [17]. The term ‘missing heritability’ [18] describes the fact that marker-phenotype associations identified in genome-wide association studies do not explain all the genetic variance in complex traits (e.g. height in humans). Some strategies have been proposed to reduce C_miss: (1) increasing the sample size in order to also detect genes with smaller effects, (2) expanding the studies to non-European samples in human genetics, (3) enlarging the collection of phenotypes to explore gene-gene interactions, (4) changing the structure of the training population, mainly in terms of the relatedness of the included individuals, and (5) moving to the genomic selection approach instead of estimating the marker effect for each SNP individually [13, 19, 20]. In animal breeding, some results suggest that the Illumina Bovine54K chip array (Illumina Inc., San Diego, CA) does not capture all the additive genetic variation for all dairy traits [21–23], even with the GS approach, which estimates all SNP effects simultaneously.

The main objective of this study was to estimate the fraction of the genetic variance not explained by the 54 K Illumina SNP chip. Two alternative marker-based relationship matrices were used for analysis.

Methods

Genotypic and phenotypic data

A total of 1092 Italian Brown Swiss bulls were genotyped with the Illumina Bovine54K chip (Illumina Inc., San Diego, CA). These bulls were born between 1963 and 2002. Figure 1 shows the distribution of the genotyped bulls over the birth years. All the SNPs on the X-chromosome were excluded from the analysis, which left 51 582 markers. The quality control process removed 1421 SNPs that had more than 5% missing genotypes and 14 455 SNPs with a minor allele frequency lower than 5%. Six sires were removed because their genotyping rate was lower than 95%. Editing was performed with two different software packages: SAS® (SAS Inst. Inc., Cary, NC) and PLINK v1.07 [24]. At the end of the quality control process, genotypes were available for 1086 sires with 35 706 SNPs and with a missing genotype rate of 0.66%.
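The filters described above can be sketched in a few lines. This is only an illustration, not the SAS/PLINK pipeline used in the study: the function name `quality_control`, the 0/1/2 genotype coding with `NaN` for missing calls, and the choice to compute all three rates on the unfiltered matrix are assumptions made for the example.

```python
import numpy as np

def quality_control(genotypes, max_snp_missing=0.05, min_maf=0.05,
                    min_call_rate=0.95):
    """Return boolean masks (keep_animals, keep_snps).

    genotypes: animals x SNPs array, coded as 0/1/2 copies of one allele,
    with np.nan for a missing call (hypothetical coding for this sketch).
    """
    genotypes = np.asarray(genotypes, dtype=float)
    missing = np.isnan(genotypes)

    # Fraction of missing genotypes per SNP (column).
    snp_missing = missing.mean(axis=0)

    # Allele frequency from the non-missing calls, then minor allele frequency.
    p = np.nanmean(genotypes, axis=0) / 2.0
    maf = np.minimum(p, 1.0 - p)

    # Keep SNPs with at most 5% missing calls and MAF of at least 5%.
    keep_snps = (snp_missing <= max_snp_missing) & (maf >= min_maf)

    # Keep animals with a genotyping (call) rate of at least 95%.
    call_rate = 1.0 - missing.mean(axis=1)
    keep_animals = call_rate >= min_call_rate
    return keep_animals, keep_snps

# The counts reported in the text are internally consistent:
# 51 582 - 1421 - 14 455 = 35 706 SNPs and 1092 - 6 = 1086 sires.
assert 51582 - 1421 - 14455 == 35706
assert 1092 - 6 == 1086
```

Whether the animal call rate was computed before or after SNP filtering is not stated in the text, so the order used here should be read as one possible choice.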

The phenotypic data available were the EBV for fat yield (FAT), milk yield (MILK), protein yield (PROT) and somatic cell score in milk (SCS) for each bull, which were calculated by the Italian National Association of Brown Swiss (ANARB). The EBV were deregressed as proposed by Garrick et al. [21], in order to eliminate the shrinkage contained in the EBV and to remove ancestral information. The deregressed EBV (dEBV) were used as phenotypic records for the bulls with heritability equal to the reliability of the EBV.

Three subsets were formed according to the reliability of EBV as follows: animals with a reliability of at least 50% for each trait; animals with a reliability greater than 70% for each trait; animals with a reliability of at least 90% for each trait.

Relationship matrices: A and G

A pedigree file was extracted from the Italian Brown Swiss herd book. Pedigree was traced back five generations and the pedigree file included 6826 entries. The completeness in the pedigree was 100% up to the grandparents, and decreased to ~90% thereafter. The equivalent number of known generations as calculated by the software Pedig [25] was on average 5.14 and the median was 5.23. The pedigree file was used to estimate the additive genetic relationships (A) with an adapted version of the procedure proposed by Meuwissen and Luo [26], as implemented in ASREML [27].
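To make the pedigree-based matrix concrete, the sketch below builds A with the classical tabular method. This is a swapped-in technique chosen for brevity, not the Meuwissen and Luo procedure as implemented in ASREML; the function name and the pedigree encoding (a list of sire and dam indices, parents listed before offspring, `None` for unknown parents) are assumptions for the example.

```python
def numerator_relationship(pedigree):
    """Tabular method for the numerator relationship matrix A.

    pedigree: list of (sire, dam) index pairs, one per animal, ordered so
    that parents appear before their offspring; None marks an unknown
    (base population) parent.
    """
    n = len(pedigree)
    A = [[0.0] * n for _ in range(n)]
    for i, (sire, dam) in enumerate(pedigree):
        # Diagonal: 1 + inbreeding, where inbreeding is half the
        # relationship between the two parents (0 if a parent is unknown).
        if sire is not None and dam is not None:
            A[i][i] = 1.0 + 0.5 * A[sire][dam]
        else:
            A[i][i] = 1.0
        # Off-diagonals: average of the relationships with the parents.
        for j in range(i):
            a = 0.0
            if sire is not None:
                a += 0.5 * A[j][sire]
            if dam is not None:
                a += 0.5 * A[j][dam]
            A[i][j] = A[j][i] = a
    return A

# Two unrelated founders (0, 1), two full sibs (2, 3) and their offspring (4).
A = numerator_relationship([(None, None), (None, None), (0, 1), (0, 1), (2, 3)])
print(A[3][2])  # relationship between the full sibs
print(A[4][4])  # 1 + inbreeding of the offspring of the full-sib mating
```

The coancestry between two animals is half the corresponding element of A, and the diagonal minus 1 is the inbreeding coefficient.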

Two genomic relationship matrices (G) were computed for all genotyped animals. The first, G_V, was based on the method proposed by VanRaden [12]. Let M be the marker-genotype matrix of dimension n × m, where n is the number of individuals and m the number of loci. The elements of M were coded as -1 (homozygous for one allele), 0 (heterozygous) and 1 (homozygous for the other allele). The n × m matrix P contains columns with all elements equal to 2(p_i - 0.5), where p_i is the frequency of the second allele at locus i. The matrix P was subtracted from M to give Z = M - P. Finally, matrix G_V was calculated as:

G_{V} = \frac{Z Z^{'}}{2 \sum_{i = 1}^{m} p_{i} (1 - p_{i})} .

The second genomic relationship matrix (G_Y) was computed as:

G_{Y} = \frac{W W^{'}}{m},

where W is the Z matrix with each element scaled by the allele frequency of its locus: $w_{ij} = \frac{Z_{ij}}{\sqrt{2 p_{j} (1 - p_{j})}}$, in which i indexes the animal and j the locus [12, 14].
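The two matrices can be computed directly from the definitions above. The sketch below is a minimal NumPy illustration; the function name is hypothetical, and estimating the allele frequencies from the genotyped sample itself is an assumption of the example (the text does not say which frequencies were used). Monomorphic loci would cause a division by zero in G_Y and are assumed to have been removed by the MAF filter.

```python
import numpy as np

def genomic_relationship_matrices(M):
    """Return (G_V, G_Y) for an n x m genotype matrix M coded -1, 0, 1."""
    M = np.asarray(M, dtype=float)
    n, m = M.shape

    # Frequency of the second allele: code +1 carries two copies of it,
    # so the number of copies is M + 1 and p = mean(M + 1) / 2.
    p = (M.mean(axis=0) + 1.0) / 2.0

    # Every row of P holds 2(p - 0.5); broadcasting subtracts it per locus.
    Z = M - 2.0 * (p - 0.5)

    # Expected heterozygosity 2p(1 - p) at each locus.
    het = 2.0 * p * (1.0 - p)

    # G_V: scale by the sum of heterozygosities across loci.
    G_V = Z @ Z.T / het.sum()

    # G_Y: scale each locus by its own heterozygosity, then average over loci.
    W = Z / np.sqrt(het)
    G_Y = W @ W.T / m
    return G_V, G_Y
```

When all loci have the same allele frequency, the two scalings coincide and G_V equals G_Y; they differ only through the weight given to loci with rare alleles.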

Correction for the base population

Both the G matrix and the pedigree-based relationship matrix, A, are expressed relative to a base population, i.e. an original population in which all animals are assumed unrelated and non-inbred, and these populations may differ between the pedigree-based and genomic relationship matrices [15]. To correct for these differences, the scale of G was changed to that of A based on Wright’s F-statistic [7]. We expressed the total inbreeding of animal i in the G matrix as:

F_{it} = G_{ii} - 1 or F_{it} = F_{st} + (1 - F_{st}) F_{is},

where F_st is the average inbreeding in the population, i.e. the average of the diagonal elements of G minus 1, and F_is is the inbreeding of animal i relative to the population average inbreeding F_st, which is calculated as: $F_{is} = \frac{(F_{it} - F_{st})}{(1 - F_{st})} = \frac{(G_{ii} - 1 - F_{st})}{(1 - F_{st})} .$

The average population inbreeding of G was set equal to that of A by rescaling the diagonal element of G corresponding to individual i as:

G_{ii}^{*} = A_{st} + (1 - A_{st}) F_{is} + 1,

where A_st is the average of the diagonals of A minus 1. The off-diagonals of G were rescaled similarly, using the same F_st and A_st values. Numerator relationships were first transformed to kinships, φ, by dividing the relationship by 2, and the base correction was performed on the kinship level, which is the same level as that of inbreeding, i.e.

\phi_{jis} = \frac{(\frac{G_{ji}}{2} - F_{st})}{(1 - F_{st})}, and

G_{ji}^{*} = 2 [A_{st} + (1 - A_{st}) \phi_{jis}],

where φ_jis is the kinship of animals j and i relative to the base population inbreeding, F_st.
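The base population correction can be written as a short function. This is a sketch under stated assumptions: the function name is hypothetical, and the diagonal is rescaled with the animal-specific F_is defined earlier in this section (taking F_is rather than the population average is the reading that keeps the diagonal animal-specific and consistent with the off-diagonal formula).

```python
import numpy as np

def rescale_G_to_A(G, A):
    """Rescale G so that its average inbreeding matches that of A."""
    G = np.asarray(G, dtype=float)
    A = np.asarray(A, dtype=float)

    # Average inbreeding implied by each matrix: mean diagonal minus 1.
    F_st = np.mean(np.diag(G)) - 1.0
    A_st = np.mean(np.diag(A)) - 1.0

    # Off-diagonals: move to the kinship level, express kinship relative
    # to F_st, re-express it relative to A_st, and return to relationships.
    kinship = (G / 2.0 - F_st) / (1.0 - F_st)
    G_star = 2.0 * (A_st + (1.0 - A_st) * kinship)

    # Diagonals: rescale the inbreeding coefficients in the same way.
    F_it = np.diag(G) - 1.0
    F_is = (F_it - F_st) / (1.0 - F_st)
    np.fill_diagonal(G_star, 1.0 + A_st + (1.0 - A_st) * F_is)
    return G_star
```

Two properties follow from the formulas and are easy to check: the mean diagonal of the rescaled matrix equals 1 + A_st, and if G and A already have the same average inbreeding the matrix is returned unchanged.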

Estimation of variance components

To estimate the fraction of the genetic variance captured by dense markers covering the entire genome, the approach of Goddard et al. [28] was used. Both matrix A and G were fitted in the model simultaneously in order to estimate the fraction of the genetic variance captured by each of these matrices. The variance component analyses were performed by ASREML-R [29], using the following model:

y = 1 μ + Z_{1} a + Z_{2} u + e,

where y is the vector of the dEBV; μ is the overall mean; Z₁ and Z₂ are the incidence matrices for pedigree-based and genomic random animal effects, respectively; a is the vector of the random additive genetic animal effects using the pedigree-based relationship matrix, with a ~ N(0, A σ²_a); u is the vector of random additive genetic effect using the genomic relationship matrix, with u ~ N(0, G σ²_u); and finally, e is the vector of random residual effects. Because the number of daughters per bull was high for all bulls, the reliabilities of the dEBV were high and varied little between bulls, and a homogeneous error variance structure was assumed.
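To show how the two relationship matrices enter the model, the sketch below evaluates a Gaussian log-likelihood for given variance components. It is not the AI-REML algorithm used by ASREML-R: it is a plain likelihood with the mean replaced by its generalized least squares estimate, and it assumes that every bull has one record and appears in both matrices, so that Z₁ and Z₂ are identity matrices. The function name is hypothetical.

```python
import numpy as np

def log_likelihood(y, A, G, var_a, var_u, var_e):
    """Gaussian log-likelihood of y = 1*mu + a + u + e.

    Assumes Z1 = Z2 = I and a homogeneous residual variance, so that
    Var(y) = A*var_a + G*var_u + I*var_e.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    V = var_a * np.asarray(A, dtype=float) \
        + var_u * np.asarray(G, dtype=float) \
        + var_e * np.eye(n)

    # Generalized least squares estimate of the overall mean.
    ones = np.ones(n)
    mu = (ones @ np.linalg.solve(V, y)) / (ones @ np.linalg.solve(V, ones))

    # Log-determinant and quadratic form of the residuals.
    _, logdet = np.linalg.slogdet(V)
    resid = y - mu
    quad = resid @ np.linalg.solve(V, resid)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)
```

Maximizing such a function over the three variances would give maximum likelihood rather than REML estimates; it is shown only to make the covariance structure of the model explicit.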

If we assume that A is an unbiased estimate of G, and write G = A + D [28], where D is a matrix of deviations from pedigree relationships due to the segregation of a finite number of chromosome segments in the genome, the genetic variance of the records becomes V(g) = G σ_u² + A σ_a² = (A + D) σ_u² + A σ_a² = A(σ_u² + σ_a²) + D σ_u². Hence, as in a model that fits only pedigree relationships (y = 1 μ + Z₁ a + e), the total genetic variance is explained by the A matrix, and the segregation of chromosome segments that are traced by the markers is explained by σ_u². The fraction of genetic variance not captured by the markers on the SNP chip (C_miss) was thus estimated as:

C_{miss} = 1 - \frac{σ_{u}^{2}}{σ_{g}^{2}} = 1 - \frac{σ_{u}^{2}}{(σ_{a}^{2} + σ_{u}^{2})},

where σ²_g is the total genetic variance, σ²_u is the variance due to marker-based relationships and σ²_a is the variance due to pedigree-based relationships.
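The regrouping of the genetic covariance that motivates C_miss can be checked numerically. The snippet below is a small illustration with made-up matrices and variances; the function name and all numbers are hypothetical.

```python
import numpy as np

def genetic_covariance_two_ways(A, D, var_u, var_a):
    """Return V(g) computed directly and after regrouping, with G = A + D."""
    A = np.asarray(A, dtype=float)
    D = np.asarray(D, dtype=float)
    G = A + D
    # Direct form: marker-based plus pedigree-based contributions.
    direct = G * var_u + A * var_a
    # Regrouped form: total genetic variance on A, deviations on var_u only.
    regrouped = A * (var_u + var_a) + D * var_u
    return direct, regrouped

# Hypothetical pedigree relationships and small symmetric deviations.
A = [[1.0, 0.25], [0.25, 1.0]]
D = [[0.02, -0.03], [-0.03, 0.01]]
direct, regrouped = genetic_covariance_two_ways(A, D, var_u=0.8, var_a=0.2)
assert np.allclose(direct, regrouped)
```

The regrouped form makes it visible that the pedigree matrix carries the total genetic variance σ_u² + σ_a², while the deviations D are weighted by σ_u² alone.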

The two additive genetic variances were also estimated by fitting each relationship matrix separately: the additive genetic variance using the pedigree-based relationship matrix ($σ_{a 0}^{2}$) and the additive genetic variance using the genomic relationship matrix ($σ_{u 0}^{2}$). These estimates were used to calculate an alternative estimate of the fraction of genetic variance not captured by the markers on the SNP chip (C_{miss 2}) as follows: $C_{miss 2} = 1 - \frac{σ_{u 0}^{2}}{σ_{a 0}^{2}}$. The estimate C_{miss 2} has the advantage that $σ_{a 0}^{2}$ is known to yield an unbiased estimate of the genetic variance, but it has the disadvantage that $σ_{u 0}^{2}$ is likely to include more genetic variance than that explained by QTL that are in LD with the markers [11]. For example, if only some of the chromosomes contain markers, these markers can still explain genetic variance at the unmarked chromosomes, because the markers trace family relationships. If, in the latter case, the pedigree-based relationship matrix is fitted simultaneously with the marker-based relationship matrix, the variance due to the unmarked chromosomes is expected to be included in the polygenic variance, σ²_a, because the pedigree-based relationship matrix more closely resembles the family relationships at the unmarked chromosomes than at the marked chromosomes, which may show relationships that (randomly) deviate from the pedigree. Thus, C_{miss 2} is expected to underestimate the fraction of missing genetic variance.
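Both estimators are simple ratios of variance components. The helper functions below are a minimal sketch with hypothetical names; the variance values in the usage lines are invented for illustration and are not estimates from this study.

```python
def c_miss(var_u, var_a):
    """Missing fraction when A and G are fitted simultaneously."""
    total = var_a + var_u
    if total <= 0.0:
        raise ValueError("total genetic variance must be positive")
    return 1.0 - var_u / total

def c_miss2(var_u0, var_a0):
    """Missing fraction when A and G are fitted in separate models."""
    if var_a0 <= 0.0:
        raise ValueError("pedigree-based genetic variance must be positive")
    return 1.0 - var_u0 / var_a0

# Hypothetical variance components (arbitrary units).
print(c_miss(var_u=0.8, var_a=0.2))     # about 0.2
print(c_miss2(var_u0=0.9, var_a0=1.0))  # about 0.1
```

In this invented example the marker-only model absorbs more variance than the markers explain in the joint model, so the second estimator comes out lower, which is the direction of bias the text anticipates.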

Results

Descriptive statistics

Descriptive statistics for each trait and dataset are in Table 1. In the group of bulls with dEBV reliabilities of at least 50%, the dEBV average reliability was ~90% (±7%) for the production traits (FAT, PROT and MILK), and 82.6% (±10.7%) for SCS. The subset of sires with dEBV reliabilities of at least 70% had a similar average reliability of ~91% (±5%) for the production traits. The lowest average reliability in this subset was 85.7% (±7.4%) for SCS. Finally, the subset of bulls with reliabilities of at least 90% had an average reliability of ~94% (±3%) for all traits. As expected, the differences in the average of the reliabilities between traits tended to decrease with increasing minimum reliability requirements.

Table 1 Descriptive statistics for de-regressed estimated breeding values (dEBV) and reliabilities (r²) for production traits*

Proportion of genetic variance not explained by markers

The fraction of genetic variance not explained by molecular markers based on C_miss was estimated for all datasets (minimum dEBV reliabilities of 50, 70 and 90%) and traits (FAT, PROT, MILK and SCS). Results are in Table 2. For dFAT50, the estimate of C_miss was 0.373 ± 0.068 based on G_V and 0.363 ± 0.069 based on G_Y. The estimates of C_miss were smaller for the dFAT70 subset than for the dFAT50 subset. For dFAT90, the estimate was 0.305 ± 0.074 based on G_V, while the analysis with the G_Y matrix did not converge. Algorithms other than the AI-REML algorithm might have converged (e.g. the EM-algorithm, which is known to be slow), but the convergence difficulties are probably due to the small size of the dataset, so any resulting variance component estimates would have been unreliable.

Table 2 Proportion of genetic variance not explained by markers (C_miss) ± standard error (SE) for dEBV for production traits*¹

The fraction of the genetic variance not explained by molecular markers was also estimated as C_{miss 2}, from the additive genetic variances estimated separately, for all datasets and traits (Table 3). Results for C_{miss 2} followed the same trends as for C_miss, but the values of C_{miss 2} were lower, probably because C_{miss 2} underestimates the fraction of the missing genetic variance.

Table 3 Proportion of genetic variance not explained by markers (C_{miss 2}) for dEBV for production traits*¹

Results for dMILK, dPROT and dSCS were similar to those described above for dFAT for both genomic relationship matrices. Estimates of C_miss for dMILK70 and dPROT70 hardly differed from those for dMILK50 and dPROT50, respectively. The subsets with dEBV90 resulted in estimates of C_miss of 0.199 (±0.101) for dMILK90 and 0.206 (±0.098) for dPROT90 when using G_Y. These estimates were not significantly different from those obtained with the larger datasets for the same traits (dEBV50 or dEBV70), although they were systematically lower for all traits.

The highest estimates for C_miss were obtained for dSCS50, with 0.532 (±0.091) for G_V. When using G_Y, the corresponding C_miss estimate was lower (0.486 ± 0.095). The smallest C_miss estimate was obtained for dSCS90: 0.061 (±0.197) using G_Y. The variance component analysis with G_V on the same dataset did not converge. This was the smallest dataset and, although the average reliability was the highest, the estimate of C_miss was not significantly different from 0.

In general, estimates of C_{miss 2} decreased as the reliability of the dEBV increased. Estimates of C_{miss 2} differed from estimates of C_miss, probably because C_{miss 2} is expected to underestimate the fraction of the missing genetic variance.

Discussion

We estimated the fraction of the genetic variance not accounted for by the SNPs in the marker panel (C_miss) based on the Illumina 54 K SNP chip for complex traits in dairy cattle. The results showed that the estimates of C_miss depended on the reliability of the phenotypes considered, i.e. the dEBV used as response values. When the accuracy of the dEBV increased, i.e. when the correlation between dEBV and the true breeding value increased, the proportion of the genetic variance explained by SNPs tended to increase. When the reliability of the dEBV is low, the family/pedigree information contributes greatly to the estimation of the EBV, which results in a larger fraction of the variance being explained by A and, in turn, in upward biases of C_miss. Because C_miss is expected to be overestimated due to the use of family information in the dEBV, the best estimates of C_miss are obtained for data sets with high reliabilities, which resulted in estimates around 0.2. This implies that the maximum accuracy of GEBV is √(1-C_miss) ≈ 0.9, which agrees with the result of Daetwyler [22], who studied the increase in the accuracy of GEBV with increasing training population sizes.
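The arithmetic behind the maximum accuracy is a one-line calculation, shown here with a hypothetical helper name:

```python
import math

def max_gebv_accuracy(c_miss):
    """Upper bound on GEBV accuracy when a fraction c_miss is not captured."""
    if not 0.0 <= c_miss <= 1.0:
        raise ValueError("c_miss must lie between 0 and 1")
    return math.sqrt(1.0 - c_miss)

print(round(max_gebv_accuracy(0.2), 3))  # 0.894
```

With C_miss = 0.2, the bound is √0.8 ≈ 0.894, which rounds to the value of 0.9 quoted in the text.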

For all production traits, the fraction of the genetic variance not explained by the SNPs was significantly different from 0, even when the phenotypes were very accurate (reliability > 90%), and were, therefore, very close to the true breeding values. Correction for the base population did not affect the fraction of the genetic variance explained by markers for any of the marker-based relationship matrices used here. The differences in C_miss estimates between using G_V and G_Y were negligible for all traits and all subsets. Similarly, when using EBV instead of dEBV (results not shown), the results were virtually the same.

If original performance records of production and SCS phenotypes are used to estimate C_miss, instead of dEBV, the upward biases mentioned above are not expected to occur. The error variances would be higher than when using dEBV, but the value of σ²_a would not be inflated, because family information does not contribute to own phenotype (in contrast to dEBV phenotypes).

The sources of phenotypic information used in genomic analyses are very heterogeneous and range from individuals with highly reliable information, i.e. progeny-tested bulls, to animals with phenotypes with low levels of accuracy, i.e. young cows. To take into account these differences in reliability in a weighted analysis, it is necessary to know the value of C_miss for each phenotype [22]. In addition, a polygenic effect must be included in the model to account for unmarked genetic effects. Knowledge of the fraction of the genetic variance not explained by markers is also required to predict the accuracy of the genomic predictions for each individual in the population, since it affects the maximum accuracy that can be achieved [17].

The base population correction of the genomic relationship matrix generally affected neither the proportion of genetic variance captured by markers, nor the genetic variance captured by the pedigree-based relationship matrices, which agrees with [17, 30] but not with [31]. The latter authors, however, scaled the relationships in the opposite direction, i.e. when G relationships were too high, they scaled all relationships downwards, which further decreased the differences in relationships that were already small since relationships are bound by a maximum of 1 (and vice-versa when G relationships were too small). Moreover, the correction for the base population facilitates the integration of relationship matrices A and G into a single matrix (H), according to Legarra et al. [32], Christensen and Lund [13], and Meuwissen et al. [15].

We also estimated C_{miss 2} using the pedigree-based estimate of genetic variance. The denominators of C_miss and C_{miss 2} were significantly different from each other but both estimates revealed that the genomic relationship matrix could explain more than 95% of genetic variance if sufficiently reliable phenotypes are used (with reliabilities greater than 95%).

It should be noted that the estimates of C_miss and C_{miss 2} depend on the SNP chip used, i.e. denser SNP chips are expected to yield lower estimates of C_miss and C_{miss 2} (a larger fraction of the variance is explained by the SNPs), and also on the family structure of the population [33]. Populations with more closely related individuals are expected to show high LD between SNPs and QTL, even when they are physically quite far apart, and, therefore, to yield lower estimates of C_miss. The population structure of the Italian Brown Swiss population reflects that of a typical dairy breeding population, and, thus, our results probably apply also to other dairy breeding populations.

Conclusions

The fraction of genetic variance explained by genetic markers from high-density SNP panels was significantly different from 0 for the complex traits analyzed when the phenotypes were not highly accurate. The minimum fraction of the genetic variance not explained by the markers (C_miss) was approximately 0.2, which was estimated based on the most accurate phenotypes. This value agrees with other values reported in the literature. Correction of the genomic relationship matrix for the variance of the allele frequency of each locus (G_Y) instead of the average frequency of heterozygotes (G_V) hardly explained any additional genetic variance. Our estimate of C_miss of 0.2 implies that about 80% of the genetic variance is explained by the Illumina 54 K SNP chip. Values for C_miss are expected to depend on the density of the chip (a denser SNP chip is expected to explain a larger fraction of the genetic variance) and on family relationships in the population, i.e. closer family relationships are expected to reduce C_miss.

References

1. Matukumalli LK, Lawley CT, Schnabel RD, Taylor JF, Allan MF, Heaton MP, O’Connell J, Moore SS, Smith TPL, Sonstegard TS, Van Tassell CP: Development and characterization of a high density SNP genotyping assay in cattle. PLoS ONE. 2009, 4: e5350.
2. Berry DP, Kearney F, Harris B: Genomic selection in Ireland. Interbull Bull. 2009, 39: 29-34.
3. Schenkel FS, Sargolzaei M, Kistemaker G, Jansen GB, Sullivan P, Van Doormaal BJ, VanRaden PM, Wiggans GR: Reliability of genomic evaluation of Holstein cattle in Canada. Interbull Bull. 2009, 39: 51-58.
4. VanRaden PM, Van Tassell CP, Wiggans GR, Sonstegard TS, Schnabel RD, Taylor JF, Schenkel FS: Invited review: Reliability of genomic predictions for North American Holstein bulls. J Dairy Sci. 2009, 92: 16-24.
5. Meuwissen THE, Hayes BJ, Goddard ME: Prediction of total genetic value using genome-wide dense marker maps. Genetics. 2001, 157: 1819-1829.
6. Calus MPL: Genomic breeding values prediction: Methods and procedures. Animal. 2010, 4: 157-164.
7. Wright S: Coefficients of inbreeding and relationship. Am Nat. 1922, 56: 330-338.
8. Malécot G: Les Mathématiques de l’Hérédité. 1948, Paris: Masson et Cie
9. van der Werf JH, de Boer IJ: Estimation of additive genetic variance when base populations are selected. J Anim Sci. 1990, 68: 3124-3132.
10. Fernando RL: Genetic evaluation and selection using genotypic, phenotypic and pedigree information. Proceedings of the 6th World Congress on Genetics Applied to Livestock Production: 11–16 January 1998; Armidale. 1998, 26: 329-336.
11. Habier D, Fernando RL, Dekkers JCM: The impact of genetic relationship information on genome-assisted breeding values. Genetics. 2007, 177: 2389-2397.
12. VanRaden PM: Efficient methods to compute genomic predictions. J Dairy Sci. 2008, 91: 4414-4423.
13. Christensen OF, Lund MS: Genomic prediction when some animals are not genotyped. Genet Sel Evol. 2010, 42: 2.
14. Yang J, Benyamin B, McEvoy BP, Gordon S, Henders AK, Nyholt DR, Madden PA, Heath AC, Martin NG, Montgomery GW, Goddard ME, Visscher PM: Common SNPs explain a large proportion of the heritability for human height. Nat Genet. 2010, 42: 565-569.
15. Meuwissen THE, Luan T, Woolliams JA: The unified approach to the use of genomic and pedigree information in genomic evaluations revisited. J Anim Breed Genet. 2011, 128: 429-439.
16. Lee SH, Goddard ME, Visscher PM, van der Werf JHJ: Using the realized relationship matrix to disentangle confounding factors for the estimation of genetic variance components of complex traits. Genet Sel Evol. 2010, 42: 22.
17. Dekkers JC: Prediction of response to marker-assisted and genomic selection using selection index theory. J Anim Breed Genet. 2007, 124: 331-341.
18. Maher B: Personal genomes: The case of the missing heritability. Nature. 2008, 456: 18-21.
19. Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, McCarthy MI, Ramos EM, Cardon LR, Chakravarti A, Cho JH, Guttmacher AE, Kong A, Kruglyak L, Mardis E, Rotimi CN, Slatkin M, Valle D, Whittemore AS, Boehnke M, Clark AG, Eichler EE, Gibson G, Haines JL, Mackay TFC, McCarroll SA, Visscher PM: Finding the missing heritability of complex diseases. Nature. 2009, 461: 747-753.
20. Makowsky R, Pajewski NM, Klimentidis YC, Vazquez AI, Duarte CW, Allison DB, de los Campos G: Beyond missing heritability: prediction of complex traits. PLoS Genet. 2011, 7: e1002051.
21. Garrick DJ, Taylor JF, Fernando RL: Deregressing estimated breeding values and weighting information for genomic regression analyses. Genet Sel Evol. 2009, 41: 55.
22. Daetwyler HD: Genome-Wide Evaluation of Populations. PhD Thesis. 2009, Wageningen: Wageningen University
23. Haile-Mariam M, Nieuwhof GJ, Beard KT, Konstantinov KV, Hayes BJ: Comparison of heritabilities of dairy traits in Australian Holstein-Friesian cattle from genomic and pedigree data and implications for genomic evaluations. J Anim Breed Genet. 2013, 130: 20-31.
24. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ, Sham PC: PLINK: a toolset for whole-genome association and population-based linkage analysis. Am J Hum Genet. 2007, 81: 559-575.
25. Boichard D, Maignel L, Verrier E: The value of using probabilities of gene origin to measure genetic variability in a population. Genet Sel Evol. 1997, 29: 5-23.
26. Meuwissen THE, Luo Z: Computing inbreeding coefficients in large populations. Genet Sel Evol. 1992, 24: 305-313.
27. Gilmour AR, Gogel BJ, Cullis BR, Thompson R: ASREML User Guide Release 3.0. 2009, Queensland, Australia: The Department of Primary Industries and Fisheries
28. Goddard ME, Hayes B, Meuwissen THE: Using the genomic relationship matrix to predict the accuracy of genomic selection. J Anim Breed Genet. 2011, 128: 409-421.
29. Butler D, Cullis B, Gilmour A, Gogel B: ASReml-R Reference Manual, Version 3. 2009, Queensland, Australia: The Department of Primary Industries and Fisheries
30. Sorensen DA, Kennedy BW: Estimation of genetic variances from unselected and selected populations. J Anim Sci. 1984, 59: 1213-1223.
31. Forni S, Aguilar I, Misztal I: Different genomic relationship matrices for single-step analysis using phenotypic, pedigree and genomic information. Genet Sel Evol. 2011, 43: 1.
32. Legarra A, Aguilar I, Misztal I: A relationship matrix including full pedigree and genomic information. J Dairy Sci. 2009, 92: 4656-4663.
33. Jensen J, Su G, Madsen P: Partitioning additive genetic variance into genomic and remaining polygenic components for complex traits in dairy cattle. BMC Genet. 2012, 13: 44.

Acknowledgements

The helpful comments of three reviewers are gratefully acknowledged. We gratefully acknowledge the Italian Brown Cattle Breeders’ Association (ANARB) for collecting, handling and sharing data. The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 222664 (“Quantomics”). This article reflects only the authors’ views and the European Community is not liable for any use that may be made of the information contained herein.

Author information

Authors and Affiliations

Dipartimento di Scienze e Tecnologie Veterinarie per la Sicurezza Alimentare, Università degli Studi di Milano, Via Celoria 10, Milano, 20133, Italia
Sergio-Iván Román-Ponce, Antonia B Samoré, Marlies A Dolezal & Alessandro Bagnato
Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, P.O. Box 5003, Oslo, N-1432, Ås, Norway
Sergio-Iván Román-Ponce & Theo HE Meuwissen
Instituto Nacional de Investigaciones Forestales Agrícolas y Pecuarias, C.E. Valles Centrales, CIRPAS, Melchor Ocampo 7, Etla, Oaxaca, 68200, México
Sergio-Iván Román-Ponce

Corresponding author

Correspondence to Sergio-Iván Román-Ponce.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

SIRP performed the study and drafted the manuscript. ABS contributed to writing the draft. SIRP, MAD and AB prepared the genotypic and phenotypic data. THEM planned and coordinated the whole study, and contributed to writing the manuscript. All the authors read and approved the final manuscript.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.

About this article

Cite this article

Román-Ponce, SI., Samoré, A.B., Dolezal, M.A. et al. Estimates of missing heritability for complex traits in Brown Swiss cattle. Genet Sel Evol 46, 36 (2014). https://doi.org/10.1186/1297-9686-46-36

Received: 24 January 2013
Accepted: 28 April 2014
Published: 04 June 2014
DOI: https://doi.org/10.1186/1297-9686-46-36
