Information management and services

SeeD is perhaps the most comprehensive genebank-characterization effort undertaken to date. We will characterize the genetic makeup of up to 200,000 DNA samples from genebank accessions and introgression populations/lines. We will also characterize accession subsets of varying size and composition for a set of priority traits and breeding targets.

Such an ambitious project would be impossible without a solid foundation in two key enabling technology areas:

  • A robust, high-throughput genome-profiling platform that accurately characterizes unknown genetic diversity, is ‘forward compatible’ with evolving genotyping technologies, and enables the standardized analysis of accessions from different genebanks at different times.
  • A state-of-the-art software platform to collect, standardize, store, query, visualize, and analyze the ‘tsunami’ of data we expect SeeD participants to generate, ideally built from modules that can be repackaged so that SeeD clients can install and use the tools themselves and make more effective use of SeeD data.

Given the critical nature of these enabling technologies, we have formed strategic partnerships with two organizations that have a proven track record of delivering solutions their clientele readily adopts: Diversity Arrays Technology (DArT) and the bioinformatics group at the James Hutton Institute (JHI).

The primary objective of our partnership with DArT is to establish and operate a joint Genetic Analysis Service (SAGA for its Spanish acronym: Servicio de Análisis Genético para la Agricultura) at the National Center for Genetic Resources (CNRG). SAGA will initially focus on analyzing DNA samples for SeeD and other MasAgro components using genotyping-by-sequencing (GBS) technologies, while building capacity to configure modern genomics tools for applications in demand-driven agricultural research. In the medium term, we envision SAGA operations becoming self-sustaining by offering a combination of genetic-analysis and value-adding information-management services. SAGA could then support applied agricultural research in Mexico and the region beyond the duration of the SeeD project and beyond the crops SeeD targets.

Information management presents a serious challenge for a project like SeeD. We are working with both DArT and JHI to co-develop a modular software platform for the SeeD project (SeeDB for ‘SeeD database’).

The entire SeeDB platform, including all its components and modules (for example, for field data collection), will be made available for free. The code will be released under an open-source license, starting with the first production version.

Project participants have developed a number of publicly available resources to facilitate the visualization and analysis of data.

| Resource | Provider | Functionality | Overview | Download |
|---|---|---|---|---|
| Flapjack | JHI | Genotypic data visualization | Overview | Download |
| CurlyWhirly | JHI | PCA analysis visualization | Overview | Download |
| Helium | JHI | Pedigree and characterization data visualization | Overview | Download |
| Strudel | JHI | Genetic and physical maps visualization | Overview | Download |
| Tablet | JHI | Next gen sequence data visualization | Overview | Download |
| Humbug | JHI | Barcode generation | Overview | Download |
| Germinate | JHI | Phenotypic, genotypic, climate, and passport data storage and visualization | Overview | Download |
| Germinate 3 Data Importer | JHI | Data import into Germinate 3 | Overview | Download |
| META-R – 3.5.1 | CIMMYT | R programs for statistical analyses relevant to breeding | Overview | Download |

Helium is one example: a visualization tool developed by SeeD collaborators that lets users paint trait and characterization data onto a pedigree tree. It won the best paper award at BioVis 2014 in Boston, USA. A freely available video and a presentation demonstrate its use.

For more information please contact us at: