Accuracy of genotype imputation to whole genome sequencing level using different populations of Nile tilapia

Full metadata record
Metadata / Description / Language
Author(s): dc.contributor: Universidade Estadual Paulista (UNESP)
Author(s): dc.contributor: Universidad de Chile
Author(s): dc.contributor: National Council for Scientific and Technological Development (CNPq)
Author(s): dc.creator: Garcia, Baltasar F.
Author(s): dc.creator: Yoshida, Grazyella M.
Author(s): dc.creator: Carvalheiro, Roberto
Author(s): dc.creator: Yáñez, José M.
Date accessioned: dc.date.accessioned: 2025-08-21T21:58:07Z
Date available: dc.date.available: 2025-08-21T21:58:07Z
Date issued: dc.date.issued: 2022-04-28
Date issued: dc.date.issued: 2022-03-30
Full source of the material: dc.identifier: http://dx.doi.org/10.1016/j.aquaculture.2022.737947
Full source of the material: dc.identifier: http://hdl.handle.net/11449/223301
Source: dc.identifier.uri: http://educapes.capes.gov.br/handle/11449/223301
Description: dc.description: A cost-effective strategy for obtaining ultra-dense genomic information is to sequence part of the population and impute the remaining animals from lower-density genotypes to sequence level. This study aimed to evaluate the feasibility of genotype imputation from medium density to sequence level in Nile tilapia and to investigate how the size and origin of the reference population affect imputation accuracy. Genomic DNA was extracted from fin-clip samples of 326 animals from three populations (PA, PB and PC). After sequencing, alignment, variant calling and genotype quality control, approximately 4.6 million single-nucleotide polymorphisms (SNPs) common to all populations were retained for the imputation analyses. Four scenarios were evaluated to assess imputation accuracy in each population, combining two reference sizes (10% or 90% of the animals of each reference population) and two reference origins (two different populations only, or all three populations used as reference). Animals in the validation set had their genotypes masked so that only 49,216 SNPs remained available, and imputation accuracy was assessed as the correlation between imputed and observed genotypes (R2). Imputation was carried out with the FImpute3 software. At the individual level, R2 showed intermediate values, ranging from 0.37 ± 0.04 to 0.56 ± 0.07 for PA, 0.43 ± 0.05 to 0.58 ± 0.08 for PB and 0.43 ± 0.05 to 0.58 ± 0.07 for PC. R2 increased when 90% of the animals from the same population were used as reference rather than only 10% (0.37 ± 0.04 to 0.54 ± 0.07 for PA, 0.43 ± 0.05 to 0.57 ± 0.07 for PB and 0.43 ± 0.05 to 0.58 ± 0.07 for PC). At the SNP level, using all three populations as reference yielded the best results in terms of the number of SNPs imputed with accuracy greater than 0.8. On average, 676,233 ± 142,291, 666,559 ± 52,648 and 592,187 ± 89,663 SNPs were imputed with accuracy >0.8 for PA, PB and PC, respectively. Considering only these highly accurate imputed SNPs, the average imputation accuracy of samples was 0.95 ± 0.06 for PA and 0.92 ± 0.07 for PB and PC in the scenarios that included more animals as reference (90% of the same population as reference, with two or three populations). There were no significant differences in R2 between the scenarios that used 90% of the animals from the same population and those that used animals from all three populations as reference, showing that drawing on other populations to enlarge the reference population had only a minor effect on imputation accuracy. In conclusion, it was feasible to impute from 50 K to approximately 700 K SNPs with high accuracy using tilapia sequence data. We also expect that using more animals from these populations, or animals from ascending lines, as reference could help the imputation process yield millions of imputed SNPs with high accuracy.
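The accuracy measure described in the abstract can be illustrated with a minimal sketch. It assumes genotypes are coded as 0/1/2 allele counts and that R2 is the squared Pearson correlation between observed and imputed genotypes, computed either per animal (individual level) or per SNP (SNP level). The abstract does not state the exact formula, so this is an assumption. The function names and toy data are hypothetical and are not part of FImpute3 or of the study's pipeline.

```python
# Hypothetical helper functions; genotypes are assumed to be coded 0/1/2.

def pearson_r2(x, y):
    """Squared Pearson correlation; None if either vector has no variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # correlation is undefined for a constant vector
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)

def accuracy_by_sample(observed, imputed):
    """Individual-level accuracy: one R2 per animal (row)."""
    return [pearson_r2(o, i) for o, i in zip(observed, imputed)]

def accuracy_by_snp(observed, imputed):
    """SNP-level accuracy: one R2 per SNP (column)."""
    obs_cols = zip(*observed)
    imp_cols = zip(*imputed)
    return [pearson_r2(o, i) for o, i in zip(obs_cols, imp_cols)]

def count_accurate_snps(snp_r2, threshold=0.8):
    """Number of SNPs whose accuracy exceeds the threshold."""
    return sum(1 for r2 in snp_r2 if r2 is not None and r2 > threshold)

# Toy validation set: 4 animals (rows) x 3 masked SNPs (columns).
observed = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 2]]
imputed = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 1]]  # one wrong call

sample_r2 = accuracy_by_sample(observed, imputed)
snp_r2 = accuracy_by_snp(observed, imputed)
n_accurate = count_accurate_snps(snp_r2)
```

In this toy example the single wrong call lowers the accuracy of the last animal and of the third SNP, so only two of the three SNPs pass the 0.8 threshold. SNPs with no variance in the validation set return `None` and are left out of the count, which is one possible convention among several.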
Description: dc.description: Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Description: dc.description: School of Agricultural and Veterinary Sciences UNESP - São Paulo State University
Description: dc.description: Facultad de Ciencias Veterinarias y Pecuarias Universidad de Chile
Description: dc.description: National Council for Scientific and Technological Development (CNPq)
Language: dc.language: en
Relation: dc.relation: Aquaculture
Source: dc.source: Scopus
Keywords: dc.subject: Genotype imputation
Keywords: dc.subject: Nile tilapia
Keywords: dc.subject: Whole-genome sequencing
Title: dc.title: Accuracy of genotype imputation to whole genome sequencing level using different populations of Nile tilapia
File type: dc.type: digital book
Appears in collections: Repositório Institucional - Unesp

There are no files associated with this item.