Genomic Crowdsourcing: Allele Frequency Community Provides Expansive, Ethnically Diverse, Freely Available Community Resource for Allele Frequency Annotation.

Title: Genomic Crowdsourcing: Allele Frequency Community Provides Expansive, Ethnically Diverse, Freely Available Community Resource for Allele Frequency Annotation.
Publication Type: Conference Paper
Year of Publication: 2015
Authors: Basset D, Boycott K, Bustamante CD, Cooper D, Eley G, Furmanski L, Glusman G, Goldstein D, Hegde M, Hieter P, Joecker A, Kaminski T, Kernohan K, Krämer A, Letovsky S, Levy S, Love T, Mason CE, Pearson N, Rehm H, Richards D, Rienhoff H, Schadt E, Shah S, Shendure J, Solomon BD, van der Spek P, Vockley JG, Yip R, Zhu X
Conference Name: American Society for Human Genetics
Date Published: 10/2015
Type of Work: Abstract
Abstract: A key challenge in genome interpretation and precision medicine is the lack of an extensive, high-quality, ethnically diverse collection of human genomes to serve as a reference set. A prospective disease-causing variant that appears to be “rare” based on publicly available sequence data may in fact be a polymorphism in an ethnic population under-represented in public databases. Resources such as the Exome Variant Server, the 1000 Genomes Project, and the Exome Aggregation Consortium have been immensely valuable to the community, and Kaviar combines such datasets into integrated allele frequencies, but public databases have not been funded to provide broad and deep ethnic representation. QIAGEN’s Ingenuity Variant Analysis™ genome interpretation solution has been used to interpret hundreds of thousands of ethnically diverse human sequencing samples. However, these NGS datasets are private, and most are never publicly released. The Allele Frequency Community (www.allelefrequencycommunity.org) has been formed to address this interpretation need. Community members have contributed extensive human exome- and genome-wide variant call datasets in a secure, anonymized, pooled fashion to create the largest integrated, freely accessible, hosted community database of allele frequencies ever available. More than 100,000 human exome- and genome-wide variant call datasets, including over 13,500 whole genomes, are already included in the Allele Frequency Community. The database is ethnically diverse, representing over 100 countries of origin, and has been shown in benchmarking studies to significantly decrease the false-positive rate in disease-causing variant identification. To enable this community resource to grow over time, users can opt in to the Allele Frequency Community and benefit from the extensive database, agreeing in return to allow their sequences to contribute to it.
Only anonymous, pooled allele frequencies are provided, protecting patient privacy. QIAGEN Bioinformatics has agreed to host the content and make it available free of charge via its HIPAA- and Safe Harbor-compliant genome interpretation ecosystem, which includes QIAGEN’s Ingenuity Variant Analysis, CLC Biomedical Research Workbench, and Clinical Insight offerings. The application of this new resource to clinical sequencing cases will be presented.