Two major forces have contributed to the fast growth of human genetic data: medical research supported by governments and academic institutes, and direct-to-consumer (DTC) sequencing companies. While the former benefits from meticulously designed sequencing standards and quality control procedures, the latter comes in various formats and sequencing methods, which change over time and with the particular needs of each company. Thanks to the members of the general public who shared their DNA data without constraint, we provide here a review of over 7000 genomes made public between 20, and produced by more than six DTC sequencing companies. An open-source tool-kit that systematically parses, quality-checks and filters genome files and statistically problematic alleles is provided to prepare consumer DNA datasets for research. The GenomePrep output is available in two common DNA data file formats to enable further analysis with other tools. We also provide for download the combined output for all OpenSNP array genomes processed in this paper in a single data freeze file.
Customers of direct-to-consumer (DTC) genotyping companies represent the majority of the population who have had their genome read. This group is under-exploited by the academic genetics community for research. At the time of writing in Jan 2021, the two largest sequencing companies, AncestryDNA and 23andMe, officially report 30 million customers in total, and research cohort sizes of 1.4 million and 1 million respectively for the Global Alliance for Genomics & Health (GA4GH), the international consortium for sharing genomic data. These are people who have paid for sequencing, own their personal raw DNA data and are free to choose how they wish to use it. Many of these consumers have shown interest in understanding their genomes better by taking the initiative to pay for third-party services such as Xcode Life and CodeGen. They represent a huge potential pool of participants for genetic research that seeks to validate findings in a random or untargeted cohort, or to study the genetics of a particular phenotype on a limited budget. One of the main hurdles in using consumer DNA data for research is that these data vary greatly in sequencing method and, most importantly, in data quality. For the majority of consumers who have had their DNA genotyped, the version of the genotyping microarray differs between companies and over time. The sequencing methods are often selected and designed around each company's marketing needs, and the DNA data are processed by a different bioinformatics team in each company.
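To make the "parse, quality check and filter" idea concrete, here is a minimal Python sketch. It is not the GenomePrep implementation, whose internals are not described here. It assumes the widely used four-column, tab-separated raw data layout (rsid, chromosome, position, genotype) with `#` comment lines and `--` for a no-call. The function names, the rsIDs and the sample content are hypothetical and exist only for illustration.

```python
import io

# Allele codes accepted as a valid call (assumption: D/I mark deletions/insertions).
VALID_ALLELES = set("ACGTDI")


def parse_genotype_lines(handle):
    """Yield (rsid, chromosome, position, genotype) from a tab-separated raw file."""
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # malformed row: wrong number of columns
        rsid, chrom, pos, genotype = fields
        if not pos.isdigit():
            continue  # corrupt or non-numeric position
        yield rsid, chrom, int(pos), genotype.upper()


def is_called(genotype):
    """True when the genotype is one or two valid allele codes (not a no-call)."""
    return 1 <= len(genotype) <= 2 and all(a in VALID_ALLELES for a in genotype)


def filter_calls(records):
    """Split parsed records into kept calls and a count of dropped no-calls."""
    kept, dropped = [], 0
    for rec in records:
        if is_called(rec[3]):
            kept.append(rec)
        else:
            dropped += 1
    return kept, dropped


# Hypothetical in-memory file; real exports are read from disk the same way.
sample = io.StringIO(
    "# hypothetical example file\n"
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAA\n"
    "rs0000002\t1\t2000\t--\n"
    "rs0000003\tX\t3000\tG\n"
    "rs0000004\t2\tnot_a_number\tCT\n"
)
kept, dropped = filter_calls(parse_genotype_lines(sample))
call_rate = len(kept) / (len(kept) + dropped)
print(f"kept={len(kept)} dropped={dropped} call_rate={call_rate:.3f}")
```

A per-file call rate like the one computed at the end is one simple quality signal. Files from other vendors may use different column layouts (for example, separate allele columns), so a real pipeline would need one parser per format.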