G-DIRT: The germplasm duplicate identification and removal tool
Several genebanks have been established across the world to conserve plant genetic resources (PGR) for breeding programs and research activities. Duplicates among germplasm accessions complicate PGR conservation, and maintaining duplicate accessions in a genebank is not an effective method of conservation. Identifying and eliminating duplicate accessions reduces management costs and improves the management and utilization of genebank resources. Data generated by genotypic characterization not only help identify duplicates within and among genebanks but also assist in characterizing genomic diversity and imputing missing passport information. G-DIRT (Germplasm Duplicate Identification and Removal Tool) was therefore developed to identify and remove duplicate germplasm accessions based on genotypic SNP data.
The G-DIRT web server was developed under the project “Germplasm genomics for trait discovery”, part of the DBT-funded network project “Germplasm Characterization and Trait Discovery in Wheat using Genomics Approaches and its Integration for Improving Climate Resilience, Productivity and Nutritional quality” at ICAR-National Bureau of Plant Genetic Resources, New Delhi-110012. Besides duplicate identification and removal, the web server also supports data pre-processing based on parameters such as minor allele frequency, missing genotype data, linkage disequilibrium pruning, Hardy-Weinberg equilibrium, and marker heterozygosity. Duplicates can be identified and removed using one of two options: (i) total genotypic difference and (ii) homozygous difference, both of which are estimated through Identical By State (IBS) analysis. The server was initially designed to identify duplicates among ~7000 wheat accessions based on genotypic data. However, it can also be used to identify and remove duplicates in other crops.
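To illustrate the two duplicate-identification options, the following is a minimal Python sketch and not the G-DIRT implementation. It assumes genotypes are coded as alternate-allele counts (0 and 2 homozygous, 1 heterozygous, `None` missing). It also assumes that "total genotypic difference" counts every locus where two accessions differ, while "homozygous difference" counts only loci where both accessions are homozygous for different alleles. The function names, the `max_diff` threshold, and the rule of keeping the first accession of each duplicate pair are hypothetical choices made for this example.

```python
from itertools import combinations


def pairwise_differences(a, b):
    """Count differing loci between two genotype vectors.

    Returns (total_diff, homozygous_diff, compared), where loci with a
    missing call (None) in either accession are skipped.
    """
    total = homo = compared = 0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue
        compared += 1
        if x != y:
            total += 1
            # Both calls homozygous (0 or 2) and different: 0 vs 2.
            if x != 1 and y != 1:
                homo += 1
    return total, homo, compared


def find_duplicates(genotypes, max_diff=0, mode="total"):
    """Return accession pairs whose difference count is <= max_diff.

    mode="total" uses all differing loci; mode="homozygous" uses only
    opposite-homozygote loci (assumed definitions, see lead-in).
    """
    pairs = []
    for (id_a, g_a), (id_b, g_b) in combinations(genotypes.items(), 2):
        total, homo, compared = pairwise_differences(g_a, g_b)
        if compared == 0:
            continue  # no shared loci, so nothing to compare
        diff = total if mode == "total" else homo
        if diff <= max_diff:
            pairs.append((id_a, id_b))
    return pairs


def remove_duplicates(genotypes, pairs):
    """Keep the first accession of each duplicate pair, drop the second."""
    dropped = set()
    for id_a, id_b in pairs:
        if id_a not in dropped:
            dropped.add(id_b)
    return [acc for acc in genotypes if acc not in dropped]


# Toy data (hypothetical accession IDs, five SNP loci).
example = {
    "ACC1": [0, 2, 1, 2, 0],
    "ACC2": [0, 2, 1, 2, 0],  # identical to ACC1
    "ACC3": [0, 2, 0, 2, 0],  # differs from ACC1 only at a heterozygous locus
    "ACC4": [2, 0, 1, 0, 2],  # clearly distinct
}

total_pairs = find_duplicates(example, max_diff=0, mode="total")
homo_pairs = find_duplicates(example, max_diff=0, mode="homozygous")
```

On this toy data, the total-difference option flags only ACC1 and ACC2 as duplicates, whereas the homozygous-difference option also groups ACC3 with them, because its single mismatch involves a heterozygous call. Under these assumed definitions the homozygous option is the more permissive criterion.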