Quality control for genome-wide association studies

Teo. Introduction: The genome-wide association study (GWAS) is increasingly common as an experimental design for investigating the genetic basis of common diseases and complex traits in humans. Here we extend these methods and describe a system of QC/QA for genotypic data in genome-wide association studies (GWAS). An important issue when creating a PED file for QC analysis is the choice of strand orientation to use for allele calls. This chapter overviews the quality control (QC) issues for SNP-based genotyping methods used in genome-wide association studies. This protocol details the steps for data quality assessment and control that are typically carried out during case-control association studies.

In genetics, a genome-wide association study (GWA study, or GWAS), also known as a whole-genome association study (WGA study, or WGAS), is an observational study of a genome-wide set of genetic variants in different individuals to see if any variant is associated with a trait. The main metrics for evaluating the quality of the genotypes are discussed, followed by a worked example of a QC pipeline starting with raw data and finishing with a fully filtered dataset ready for downstream analysis. Marchini J, Howie B, Myers S, McVean G, Donnelly P. The steps described involve the identification and ... Here, the authors report a meta-analysis of genome-wide association studies of flavor. Statistical methods to test for association in case-control GWA studies: allele counting, chi-square test, logistic regression, multiple testing and power, example.
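The allele-counting chi-square test named above can be illustrated with a short sketch. This is an illustration under stated assumptions, not the method of any particular cited paper: the function name `allelic_chi_square` and the genotype-count tuple layout are hypothetical, and the p-value uses the standard identity for a chi-square statistic with one degree of freedom.

```python
import math


def allelic_chi_square(case_counts, control_counts):
    """Allele-counting chi-square test for one biallelic SNP.

    Each argument is a hypothetical (n_AA, n_Aa, n_aa) tuple of genotype
    counts. Returns (chi2, p_value) for the 2x2 table of allele counts.
    """
    def allele_counts(genotypes):
        n_AA, n_Aa, n_aa = genotypes
        # Every individual contributes two alleles.
        return 2 * n_AA + n_Aa, 2 * n_aa + n_Aa

    a, b = allele_counts(case_counts)      # case alleles: A, a
    c, d = allele_counts(control_counts)   # control alleles: A, a
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        raise ValueError("test undefined: an allele or a group has zero count")

    # Pearson chi-square for a 2x2 table, 1 degree of freedom.
    chi2 = n * (a * d - b * c) ** 2 / math.prod(margins)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    chi2, p = allelic_chi_square((30, 50, 20), (20, 50, 30))
    print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

Because a GWAS repeats this test across several hundred thousand SNPs, the per-SNP p-value has to be judged against a threshold corrected for multiple testing rather than against 0.05.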

A test to assess the genotyping quality of individual probands in family-based association studies and an application to the HapMap data. To the best of our knowledge, this is the first comprehensive solution for secure quality control for meta-analysis of genome-wide association studies. This article outlines the design and analysis of genetic association studies, but it focuses specifically on case-control studies in candidate genes or regions. The quality control (QC) filtering of single nucleotide polymorphisms (SNPs) is an important step in genome-wide association studies to minimize potential false findings.

We propose a transmission test that is based on this feature and that can be used. A new multipoint method for genome-wide association studies by imputation of genotypes. Useful software packages for data management, quality control, and statistical analysis in genome-wide association studies. GWASTools brings the interactive capability and extensive statistical libraries of R to GWAS. Data for genome-wide association studies (GWAS) demand a fair amount of preprocessing and quality control (QC), especially SNP genotypes. Fardo DW, Ionita-Laza I, Lange C (2009) On quality control measures in genome-wide association studies. A genome-wide association study (GWAS) is a new approach that involves rapidly scanning several hundred thousand up to 5 million markers across the complete sets of DNA of many people to find genetic variations associated with a particular trait.

This paper provides details on the necessary steps to assess and control data in genome-wide association studies (GWAS) using genotype information. SNP QC commonly uses expert-guided filters based on QC variables. GWAS for multiple sclerosis (MS): data cleaning, quality control, results. Revision has been made in the context of genome-wide association studies (GWASs). The need for careful attention to data quality has been appreciated for some time in this field, and a number of strategies for quality control and quality assurance (QC/QA) have been developed. Allele transmissions in pedigrees provide a natural way of evaluating the genotyping quality of a particular proband in a family-based, genome-wide association study. Genome-wide association studies (GWAS) have evolved over the last ten years into a powerful tool for investigating the genetic architecture of human disease. In this paper, we discuss a number of biostatistical aspects of GWAS in detail. Data are stored in NetCDF format to accommodate extremely large datasets that cannot fit within R's memory limits.

The QC pipeline developed by the eMERGE network has enabled a thorough analysis of the quality of the genome-wide genotype data generated on the 17,000 samples. Genome-wide Association Studies and Genomic Prediction pulls together expert contributions to address this important area of study. We specifically consider quality control issues and ...

Quality control for genome-wide association studies. Cedric Gondro, Seung Hwan Lee, Hak Kyo Lee and Laercio R. Porto-Neto. Summary: This chapter overviews the quality control (QC) issues for SNP-based genotyping methods used in genome-wide association studies. Genome-wide association and pathway analysis of carcass and meat quality traits in Piemontese young bulls. Volume 14, Issue 2.

Genome-wide association studies, March 14, 2012, Karen Mohlke, Ph.D. PLINK is a comprehensive, open-source command-line tool for genome-wide association studies (GWAS) and population genetics research. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype ... QC variables include Hardy-Weinberg equilibrium, missing proportion (MSP) and minor allele frequency (MAF). In these genome-wide association studies (GWAS), several hundreds of thousands of single nucleotide polymorphisms (SNPs) are analyzed at the same time, posing substantial biostatistical and computational challenges.
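The three QC variables mentioned above (Hardy-Weinberg equilibrium, missing proportion and minor allele frequency) can be computed per SNP from genotype counts. The sketch below is a minimal illustration: the function name `snp_qc` and the default thresholds are assumptions chosen for the example, not values taken from any of the cited studies, and the Hardy-Weinberg check uses the simple one-degree-of-freedom chi-square approximation rather than an exact test.

```python
import math


def snp_qc(n_AA, n_Aa, n_aa, n_missing,
           max_missing=0.05, min_maf=0.01, min_hwe_p=1e-6):
    """Compute MSP, MAF and an HWE p-value for one SNP and apply filters.

    Thresholds are illustrative assumptions; real studies choose their own.
    """
    n_called = n_AA + n_Aa + n_aa
    n_total = n_called + n_missing
    if n_called == 0:
        return {"msp": 1.0, "maf": 0.0, "hwe_p": float("nan"), "passed": False}

    msp = n_missing / n_total                  # missing proportion
    p = (2 * n_AA + n_Aa) / (2 * n_called)     # frequency of allele A
    q = 1.0 - p
    maf = min(p, q)                            # minor allele frequency

    if maf == 0.0:
        hwe_p = 1.0  # monomorphic SNP: the HWE test is uninformative
    else:
        observed = (n_AA, n_Aa, n_aa)
        expected = (n_called * p * p, 2 * n_called * p * q, n_called * q * q)
        chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
        # Chi-square with 1 df: survival function is erfc(sqrt(x / 2)).
        hwe_p = math.erfc(math.sqrt(chi2 / 2))

    passed = msp <= max_missing and maf >= min_maf and hwe_p >= min_hwe_p
    return {"msp": msp, "maf": maf, "hwe_p": hwe_p, "passed": passed}


if __name__ == "__main__":
    print(snp_qc(25, 50, 25, 0))   # genotype counts in exact HWE proportions
    print(snp_qc(50, 0, 50, 0))    # no heterozygotes: strong HWE deviation
```

PLINK exposes comparable filters through its `--geno`, `--maf` and `--hwe` options, so a hand-written function like this is mainly useful for understanding what those filters measure.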

Data quality control in genetic case-control association studies. Quality control procedures for genome-wide association studies. A catalog of genome-wide association studies: full description of methods. GWASTools is an R/Bioconductor package for quality control and analysis of genome-wide association studies (GWAS). This paper provides details on the necessary steps to assess and control data in genome-wide association studies (GWAS) using genotype information. Quality control (QC) procedures for GWAS are computationally intensive, operationally ... Genome-wide association studies (GWAS) are being conducted at an unprecedented rate in population-based cohorts and have increased our understanding of the pathophysiology of complex disease. Participants will learn about quality control and quality assurance steps of genome-wide data, basic GWAS analysis, construction of genetic risk scores, estimation of genome-wide SNP heritability, an introduction to family-based association approaches and an overview of meta-analytic techniques. This protocol provides guidelines for (1) organizational ... Genome-wide association studies: only three GWA studies of sleep-related phenotypes are currently available in humans.

They all have a common aim: to demonstrate the utility of, and draw attention to, the R environment for statistical genetics or genetic ... Genome-wide association studies, quality control, Illumina, R statistics. Quality control and quality assurance in genotypic data for genome-wide association studies. Genome-wide scans of ... Design: We conducted a meta-analysis of four genome-wide association studies (GWASs) encompassing 3771 cases and 5426 controls. These genome-wide association studies focus on showing differences in the frequencies of variants between case and control groups, rather than cotransmission of a variant and disease through a family, as is done in linkage studies.

Fardo (3). Division of Biomedical Informatics, College of Medicine, University of Kentucky, Lexington, KY 40536, USA. Genome-wide association studies (GWAS), SNP genotyping, quality control, statistics. Weekly PubMed searches are done using the terms genome-wide or genome and identification or genome and association, with limits on the current year and human status. Genome-wide association studies in practice: Risch and Merikangas (1996) say that to detect a disease allele with a frequency of 0. ...

Even in this era of genome-wide studies, case-control studies still form the majority of published reports. All of these data have been deposited in dbGaP along with corresponding quality control documents that describe all of the QC details for each dataset individually. Statistical analysis of genome-wide association (GWAS) data, Jim Stankovich. Flavor is one of the most important traits for improving tomato sensory quality and consumer acceptability. Sullivan (3). (1) Department of Psychiatry, Trinity College Dublin, Dublin, Ireland; (2) Department of Psychological Medicine, School of Medicine, Cardi... Biostatistical aspects of genome-wide association studies. Quality control for genome-wide association studies in humans. Arne Schillert, Andreas Ziegler. Introduction: In their last issue in 2006, the news staff (2006) from Science announced genome-wide association (GWA) studies to be one of the areas to watch in 2007.

After targeted sequencing and functional annotation, we performed in vitro and in vivo experiments to confirm the functions of ... It identifies probands with insufficient genotyping quality that were not removed by standard quality control filtering. GWA studies are experiments in which numerous SNPs (usually 1 or 2 million or more, currently across approximately 90% of the genome) are genotyped in large populations. The proposed secure quality control (SQC) guarantees that the analysts will receive nothing other than the final quality measurements. Aug 26, 2010: This protocol deals with the quality control (QC) of genotype data from genome-wide and candidate-gene case-control association studies, and outlines the methods routinely used in key studies from ... An important step in the analysis of genome-wide association studies is the data cleaning/QC filtering step. Automation is important as it reduces human errors and increases work efficiency.

Automated quality control for genome-wide association studies, by Sally R. ...

Rigorous organization and quality control (QC) are necessary to facilitate successful genome-wide association meta-analyses (GWAMAs) of statistics aggregated across multiple genome-wide association studies. Here we enumerate some of the challenges in QC of GWAS data and describe the ... Meat quality related phenotypes are difficult and expensive to measure and predict but are ideal candidates for genomic selection if genetic markers that account for a worthwhile proportion of the phenotypic variation can be identified. First, we will show how to apply rigorous quality control (QC) procedures on genotype data prior to conducting GWAS, including the use of ... Natural variations and genome-wide association studies in crop plants. Common statistical issues in genome-wide association studies.

The volume begins with a section covering the phenotypes of interest as well as design issues for GWAS, then moves on to discuss efficient computational methods to store and handle large datasets, quality control ... Regardless of context, the practical utility of this information will ultimately depend upon the quality of the original data.
