By: Everestus Akanno, PhD, Geneticist, Genesus Inc.
The success of pig genetic improvement programs depends heavily on the continued generation, collection, and analysis of industry-wide data from both the nucleus and commercial settings.
These data come from a variety of sources, including progeny test facilities, packing plants, nucleus and commercial barns, and DNA laboratories (Fig. 1). These are typically fast-paced work environments where data are generated and collected quickly and continuously, and where data accuracy may not be people's top priority, which creates room for errors. Advances in technology and computation now allow data generation to be automated and integrated into a remote database, which can reduce potential errors. Nevertheless, questions remain about the quality and validity of industry-wide data for comparing cohorts, evaluating genetic merit, and making selection decisions.
What is data integrity?
In the context of genetic improvement programs, data integrity is the extent to which the data collected on an individual are complete, consistent, accurate, and reliable for genetic evaluation purposes. According to guidelines from the International Committee for Animal Recording (ICAR, 2018), a complete and accurate record on an animal should have the following attributes:
- Animal identification – The animal should be properly identified using a suitable identification method.
- Parentage verification – The parentage of the individual should be verified and trackable.
- Dates of recording – The dates of birth and dates of measurements should be complete and accurate.
- Phenotypic values – The recorded value for production or performance should fall within the published allowable range for the trait and breed.
- Systematic effects – Factors known to be associated with the record of performance for an individual should be noted and properly documented.
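The attributes above can be turned into an automated completeness check. The following is a minimal sketch in Python, not a description of any particular recording system: the `Record` fields, the `check_record` function, the trait names, and the numeric limits in `TRAIT_LIMITS` are all hypothetical and chosen only for illustration. Real limits would come from the published baselines for the trait and breed.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Hypothetical plausibility limits (trait -> (min, max)). Illustrative only;
# real limits should come from published baselines for the trait and breed.
TRAIT_LIMITS = {
    "backfat_mm": (3.0, 40.0),
    "off_test_weight_kg": (60.0, 180.0),
}


@dataclass
class Record:
    """One performance record (hypothetical layout)."""
    animal_id: Optional[str]
    sire_id: Optional[str]
    dam_id: Optional[str]
    birth_date: Optional[date]
    measure_date: Optional[date]
    trait: str
    value: Optional[float]
    contemporary_group: Optional[str]  # stands in for documented systematic effects


def check_record(rec: Record) -> List[str]:
    """Return a list of problems found; an empty list means the record passed."""
    problems = []
    if not rec.animal_id:
        problems.append("missing animal identification")
    if not rec.sire_id or not rec.dam_id:
        problems.append("parentage incomplete")
    if rec.birth_date is None or rec.measure_date is None:
        problems.append("missing date")
    elif rec.measure_date < rec.birth_date:
        problems.append("measurement dated before birth")
    limits = TRAIT_LIMITS.get(rec.trait)
    if rec.value is None:
        problems.append("missing phenotypic value")
    elif limits is not None and not (limits[0] <= rec.value <= limits[1]):
        problems.append("phenotypic value outside allowed range")
    if not rec.contemporary_group:
        problems.append("systematic effects not documented")
    return problems
```

A record that fails any check would be held back for review rather than silently corrected, so that the source of the error can be traced.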
Issues with data integrity in pig production
Data collection and interpretation form the foundation for the many decisions made in the pig industry. Generation of significant amounts of data has become a normal part of the pig genetic improvement business, especially with the advent of genomic technology. However, human errors and failure of automated systems can compromise data integrity. Examples of potential issues with data integrity include but are not limited to the following:
- Mislabelling of samples (e.g. for genotyping purposes).
- Poor handling of samples during storage, which may result in missing data.
- Incorrect animal identification.
- Incorrect assignment of parentage.
- Error in data entry.
- Failure in automated measurement systems leading to inaccurate measurements or a break in timed measurements (e.g. individual feed intake equipment).
- Inaccurate ultrasound recording by inexperienced or untrained technicians.
Efforts to mitigate these issues will go a long way towards improving the quality and integrity of the data used for genetic evaluation, leading to more accurate estimation of genetic merit.
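Where genotypes are available, incorrect parentage assignments and mislabelled samples can often be detected by counting opposing homozygotes: markers at which the offspring and the recorded parent are homozygous for different alleles, which a true parent-offspring pair should not show apart from genotyping errors. The sketch below assumes genotypes coded as 0, 1, or 2 copies of one allele, with `None` for a missing call. The function names and the 1% threshold are hypothetical choices for illustration, not values taken from any published guideline.

```python
from typing import Optional, Sequence


def opposing_homozygote_rate(
    offspring: Sequence[Optional[int]],
    parent: Sequence[Optional[int]],
) -> float:
    """Share of jointly called markers where one animal is 0 and the other is 2."""
    compared = 0
    conflicts = 0
    for o, p in zip(offspring, parent):
        if o is None or p is None:
            continue  # skip markers with a missing call in either animal
        compared += 1
        if {o, p} == {0, 2}:
            conflicts += 1
    if compared == 0:
        raise ValueError("no markers called in both animals")
    return conflicts / compared


def parentage_plausible(
    offspring: Sequence[Optional[int]],
    parent: Sequence[Optional[int]],
    max_rate: float = 0.01,  # hypothetical tolerance for genotyping error
) -> bool:
    """True if the conflict rate is low enough to be explained by genotyping error."""
    return opposing_homozygote_rate(offspring, parent) <= max_rate
```

A pair that fails the check points to either a wrong pedigree entry or a sample swap, so both the parentage record and the sample labelling would need to be reviewed.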
Measures to improve data integrity in genetic evaluation systems
As previously noted, mistakes in parentage assignment and in the linking of data (genotype or phenotype) to the right animals in the recording system can be disastrous and undermine the predictive power of the genetic evaluation system. The most important key to data integrity is people. Staff who have a keen interest in quality data, and who understand its importance, are the most valuable resource for ensuring data integrity. Data integrity therefore needs to be monitored frequently, with close attention to the following areas:
- Data from various sources need to be verified and interrogated before being integrated into the database.
- All software that supports data collection, data processing and data reporting needs to be regularly validated.
- Access to the database should be restricted to individuals responsible for data collection and management.
- All persons involved with data collection and analysis should be trained and maintain certification, as appropriate.
- Quality control measures should be in place and automated to identify potential errors in data entry.
- Data usage and analysis should include steps for identifying, visualising, and filtering erroneous data.
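As one illustration of the automated quality control and filtering steps listed above, the sketch below flags phenotypic values that lie far from the rest of their contemporary group. It uses the median and the median absolute deviation (MAD), which are less affected by the erroneous values themselves than the mean and standard deviation. The function name `flag_outliers`, the input layout, and the cut-off `k = 3.5` are assumptions made for this example and are not a description of any specific evaluation pipeline.

```python
from collections import defaultdict
from statistics import median
from typing import Iterable, Set, Tuple


def flag_outliers(
    records: Iterable[Tuple[str, str, float]],
    k: float = 3.5,  # hypothetical cut-off in robust standard deviations
) -> Set[str]:
    """records: (animal_id, contemporary_group, value). Returns flagged animal ids."""
    by_group = defaultdict(list)
    for animal_id, group, value in records:
        by_group[group].append((animal_id, value))

    flagged = set()
    for rows in by_group.values():
        values = [v for _, v in rows]
        med = median(values)
        mad = median([abs(v - med) for v in values])
        if mad == 0:
            continue  # no spread to judge against; leave the group for manual review
        scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
        for animal_id, v in rows:
            if abs(v - med) / scale > k:
                flagged.add(animal_id)
    return flagged
```

Flagged records are candidates for inspection, not automatic deletion: an extreme value may be a data entry error, an equipment fault, or a genuinely exceptional animal.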
As a leading global pig genetics company, Genesus Inc. takes data integrity very seriously. Our dedicated staff treat data integrity as their highest priority. We continuously monitor data integrity and have established measures for identifying erroneous data and keeping them out of the database. In addition, the Genesus Genetic Team is continuously researching and developing novel approaches for improving the quality of data used in the estimation of genetic merit, thereby delivering the best genetics to our customers.