Principal component and factor analytic models in international sire evaluation
1 Biotechnology and Food Research, Biometrical Genetics, MTT Agrifood Research Finland, 31600 Jokioinen, Finland
2 Animal Genetics and Breeding Unit, University of New England, Armidale NSW 2351, Australia
3 Department of Animal Breeding and Genetics, SLU, Box 7023, S-75007 Uppsala, Sweden
4 UMR 1313 INRA, Génétique Animale et Biologie Intégrative, 78352 Jouy-en-Josas Cedex, France
5 Interbull Centre, Department of Animal Breeding and Genetics, SLU, Box 7023, S-75007 Uppsala, Sweden
Genetics Selection Evolution 2011, 43:33 doi:10.1186/1297-9686-43-33
Published: 23 September 2011
Interbull is a non-profit organization that provides internationally comparable breeding values for globalized dairy cattle breeding programmes. Because trait definitions and genetic evaluation models differ between countries, each biological trait is treated as a different trait in each participating country. This yields a genetic covariance matrix whose dimension equals the number of countries and which typically involves high genetic correlations between countries. If the genetic (co)variance matrices are treated as unstructured, this gives rise to several problems, such as over-parameterized models and increased sampling variances.
Principal component (PC) and factor analytic (FA) models allow highly parsimonious representations of the (co)variance matrix compared to the standard multi-trait model and have, therefore, attracted considerable interest for their potential to ease the burden of the estimation process for multiple-trait across country evaluation (MACE). This study evaluated the utility of PC and FA models for estimating variance components and predicting breeding values in MACE for protein yield. The models were tested on a dataset comprising Holstein bull evaluations obtained in 2007 from 25 countries.
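The idea of a parsimonious PC representation can be illustrated with a minimal sketch. It assumes a small, made-up 4 x 4 covariance matrix with high correlations (not data from this study) and a hypothetical helper `pc_approximation`; it shows only the generic eigen-decomposition idea behind a reduced-rank covariance matrix, not the estimation procedure used in the study.

```python
import numpy as np

def pc_approximation(G, m):
    """Rank-m approximation of a symmetric covariance matrix G,
    built from its m leading eigenvalues and eigenvectors."""
    eigvals, eigvecs = np.linalg.eigh(G)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:m]     # indices of the m largest eigenvalues
    E = eigvecs[:, order]                     # q x m matrix of leading eigenvectors
    L = np.diag(eigvals[order])               # m x m diagonal matrix of eigenvalues
    return E @ L @ E.T

# Illustrative (made-up) covariance matrix for four "countries"
# with high between-country correlations.
G = np.array([
    [1.00, 0.90, 0.85, 0.80],
    [0.90, 1.00, 0.88, 0.82],
    [0.85, 0.88, 1.00, 0.86],
    [0.80, 0.82, 0.86, 1.00],
])

G2 = pc_approximation(G, 2)

# Proportion of total variance captured by the two leading components.
explained = np.sort(np.linalg.eigvalsh(G))[::-1][:2].sum() / np.trace(G)
print(f"proportion of variance explained by 2 PCs: {explained:.3f}")
```

When correlations are high, as in this toy matrix, a few leading components capture most of the variance, which is why a reduced-rank fit can replace the full matrix at little cost.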
In total, 19 principal components or nine factors were needed to explain the genetic variation in the test dataset. Estimates of the genetic parameters under the optimal fit were almost identical for the two approaches. Furthermore, the results were in good agreement with those obtained from the full rank model and with those provided by Interbull. The estimation time was shortest for models fitting the optimal number of parameters and longer when under- or over-parameterized models were applied. Correlations between estimated breeding values (EBV) from the PC19 and PC25 models were unity. With few exceptions, correlations between EBV obtained using the FA and PC approaches under the optimal fit were ≥ 0.99. For both approaches, EBV correlations decreased when the optimal model was compared with models fitting too few parameters.
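To give a sense of the parsimony involved, the sketch below counts covariance parameters for 25 countries. It assumes the counting formulas commonly given for these parameterisations, which are not stated in the text above: q(q+1)/2 for an unstructured matrix, m(2q-m+1)/2 for a PC model with m components, and q + mq - m(m-1)/2 for an FA model with m factors and q specific variances. The function names are hypothetical.

```python
def n_params_unstructured(q):
    # Distinct elements of a symmetric q x q covariance matrix.
    return q * (q + 1) // 2

def n_params_pc(q, m):
    # Assumed count for a reduced-rank PC model with m components.
    return m * (2 * q - m + 1) // 2

def n_params_fa(q, m):
    # Assumed count for an FA model: q specific variances plus
    # m*q loadings, minus m(m-1)/2 identifiability constraints.
    return q + m * q - m * (m - 1) // 2

q = 25  # number of countries in the test dataset
counts = {
    "unstructured": n_params_unstructured(q),
    "PC19": n_params_pc(q, 19),
    "FA9": n_params_fa(q, 9),
}
for model, n in counts.items():
    print(model, n)
```

Under these assumed formulas, the unstructured matrix has 325 parameters, against 304 for 19 principal components and 214 for nine factors.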
Genetic parameters from the PC and FA approaches were very similar when the optimal number of principal components or factors was fitted. Over-fitting increased estimation time and standard errors of the estimates but did not affect the estimates of genetic correlations or the predictions of breeding values, whereas fitting too few parameters affected bull rankings in different countries.