• Wheat Genotypic Datasets


    The MAB project has generated several types of genotypic data describing wheat germplasm, including accessions from the CIMMYT Germplasm Bank (CGB) and pre-breeding materials derived from CGB materials. Data types include single nucleotide polymorphisms (SNPs), presence/absence variations (PAVs), and allele frequencies for thousands of markers. These data are released as a small number of key “datasets”, each targeting a subset of germplasm and/or markers.

    Product details and features

    In general, very high-density genotypic data (with more than one million markers per sample) are released through Dataverse, whereas allele frequency and lower density SNP call datasets are typically released through Germinate.

    Dataverse Genotypic Datasets:

    1. A list of genotypic datasets available for download in the Dataverse repository provides links to individual studies.
    2. Studies contain fixed data files including the genotypic results file(s), and supporting files such as mapping files to link DNA sample names to germplasm identifiers, protocols for extraction or analysis, or other relevant documents.
    3. Each study is annotated with standard study-level metadata, including study title, description, authors, data generators, date of generation, keywords, links to related studies, and links to relevant journal articles.
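
    As a rough illustration of how a mapping file might be used, the sketch below relabels DNA sample names with germplasm identifiers. The tab-delimited layout and the column names `dna_sample` and `germplasm_id` are assumptions made for this example; the mapping files in each study may be organized differently.

```python
import csv
import io


def load_sample_map(handle, sample_col="dna_sample", germplasm_col="germplasm_id"):
    """Read a tab-delimited mapping file into {DNA sample name: germplasm identifier}.

    The column names are hypothetical; adjust them to the actual file header.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[sample_col]: row[germplasm_col] for row in reader}


def relabel_samples(sample_names, sample_map):
    """Swap each DNA sample name for its germplasm identifier.

    Names missing from the mapping are kept unchanged so they can be reviewed.
    """
    return [sample_map.get(name, name) for name in sample_names]


# Hypothetical mapping file contents, used here instead of a real download.
mapping_text = "dna_sample\tgermplasm_id\nS001\tGID_A\nS002\tGID_B\n"
sample_map = load_sample_map(io.StringIO(mapping_text))

relabelled = relabel_samples(["S001", "S002", "S003"], sample_map)
print(relabelled)
```

    With a real file, the `io.StringIO` object would be replaced by a file opened with `open(path, newline="")`.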




    Germinate Genotypic Datasets:

    1. Users must first register (free of charge), agree to the terms of the data sharing license, and log in to Germinate.
    2. Users can then create “Groups” of germplasm or markers of interest using one or more search or filtering tools. Groups can be used immediately to generate customized subsets of data and/or saved for future use.
    3. SNP genotypic datasets in the Germinate data warehouse are available for direct download in a simple matrix format.
    4. Users can choose to export marker positions based on specific physical or genetic maps.
    5. Users can then download the selected genotypic data in a plain text format or as a “Project” that can be viewed immediately in the Flapjack software.
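
    Once a plain text matrix has been downloaded, it can be read with a few lines of Python. The sketch below is only an illustration: it assumes a tab-delimited file with a header row of marker names, one row per germplasm line, and optional comment lines starting with `#`. The line and marker names are invented, and the exact layout of a Germinate export should be checked before relying on this.

```python
import csv
import io


def read_genotype_matrix(handle):
    """Parse a tab-delimited matrix into (marker names, {line: {marker: call}}).

    Assumed layout: a header row whose first cell is a label or empty, followed
    by one row per germplasm line. Blank lines and '#' comment lines are skipped.
    """
    rows = (
        row
        for row in csv.reader(handle, delimiter="\t")
        if row and not row[0].startswith("#")
    )
    header = next(rows)
    markers = header[1:]
    calls = {row[0]: dict(zip(markers, row[1:])) for row in rows}
    return markers, calls


def subset_calls(calls, germplasm_group, marker_group):
    """Keep only the chosen germplasm lines and markers, mimicking a saved group."""
    return {
        line: {marker: calls[line][marker] for marker in marker_group}
        for line in germplasm_group
        if line in calls
    }


# Invented example content standing in for a downloaded file.
matrix_text = (
    "# example header comment\n"
    "\tM1\tM2\tM3\n"
    "LineA\tA\tG\tT\n"
    "LineB\tA/G\tG\t-\n"
)
markers, calls = read_genotype_matrix(io.StringIO(matrix_text))
selected = subset_calls(calls, ["LineB"], ["M1", "M3"])
print(markers)
print(selected)
```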


    Comments

    Some of the genotypic datasets, especially the very high-density data available in Dataverse, are very large. Downloads may take a long time, particularly for users with limited internet connectivity. Data can be provided in alternative ways if direct download from the internet is not possible. The large file sizes may also make the data hard or impossible to work with on computers with limited memory, or in applications such as Excel that do not support extremely large files. Please contact Cimmyt-mab-seed@cgiar.org for additional help with accessing or working with any of these genotypic data files.
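
    One common way to handle files that do not fit in memory is to process them one row at a time and keep only the columns of interest. The sketch below illustrates this under stated assumptions: a tab-delimited file with marker names in the header row and one sample per row. The function name and example data are hypothetical, and files oriented the other way (markers as rows) would need row filtering instead.

```python
import csv
import io


def stream_marker_subset(in_handle, out_handle, wanted_markers):
    """Copy the first column plus the wanted marker columns, one row at a time.

    Only a single row is held in memory at any moment, so the input file can be
    far larger than available RAM. Returns the number of data rows written.
    """
    reader = csv.reader(in_handle, delimiter="\t")
    writer = csv.writer(out_handle, delimiter="\t", lineterminator="\n")

    header = next(reader)
    wanted = set(wanted_markers)
    # Column 0 holds the sample name and is always kept.
    keep = [0] + [i for i, name in enumerate(header) if i > 0 and name in wanted]
    writer.writerow([header[i] for i in keep])

    n_rows = 0
    for row in reader:
        writer.writerow([row[i] for i in keep])
        n_rows += 1
    return n_rows


# Invented example data; a real file would be opened with open(path, newline="").
source_text = "sample\tM1\tM2\tM3\nLineA\tA\tG\tT\nLineB\tC\tG\tT\n"
output = io.StringIO()
rows_written = stream_marker_subset(io.StringIO(source_text), output, ["M1", "M3"])
print(rows_written)
print(output.getvalue())
```

    The reduced file produced this way is often small enough to open in a spreadsheet or load fully into an analysis tool.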

    Primary Users

    Researchers and students working in relevant fields such as biology, breeding, and bioinformatics, as well as wheat pre-breeders, molecular breeders, germplasm bank curators, and users of CIMMYT wheat germplasm bank materials.




  • Availability

    Genotypic Datasets in Dataverse: http://data.cimmyt.org/dvn/dv/seedsofdiscoverydvn/faces/StudyListingPage.xhtml?mode=1&collectionId=15

    Allele Frequency Genotypic Datasets in Germinate: http://germinate.seedsofdiscovery.org/wheat/#genotype-datasets
    SNP Datasets in Germinate: Available in November 2017.

    Molecular Maps in Germinate: http://germinate.seedsofdiscovery.org/wheat/#map-details

     

    For more information

     

