Michael Major, Crop Trust
Banner photo: Georgina Smith/CIAT
Genesys, the global online portal to information about plant genetic resources for food and agriculture (PGRFA) in genebanks, has been significantly enhanced with the introduction of the Genesys Catalog for Phenotypic Datasets.
“Genesys has been publishing basic information about germplasm samples like source, origin and taxonomy for almost a decade. We call this ‘passport data’,” said Nora Castañeda-Álvarez, the Genesys Catalog Coordinator. “With the Genesys Catalog, we are enabling genebanks to publish additional information about their samples, like the size and shape of leaves or the color of seeds and flowers, or even yield or drought tolerance. These data can help users focus their requests for genetic material.”
Germany’s Federal Office for Agriculture and Food (BLE) has funded the two-year project to establish the Genesys Catalog. The project aims to make the wealth of characterization and evaluation data on genebank accessions accessible to plant breeders online via Genesys.
Passport, characterization and evaluation data are valuable for germplasm users like plant breeders, who can use them to select the genetic material they will request from genebanks. For example, say a breeder wants to develop a tomato variety which is tolerant to whitefly (Bemisia tabaci). It’s a pest that can devastate tomato production and is hard to fight with pesticides. She knows that the World Vegetable Center hosts almost 6,500 samples of tomato genetic resources, but which one to ask for?
To focus her germplasm search, the breeder could request materials collected in countries where both the crop and the pest are present, assuming that at least some varieties must have developed a certain degree of tolerance to the pest. However, whitefly affects tomatoes in many different countries, so our breeder would still end up with a huge set of samples for her research.
Now, with the Genesys Catalog, she can check the dataset Biotic stress resistance data of accessions of tomato wild relatives (Solanum spp.) accessions from the WorldVeg genebank collection. This contains data on 20 different plant traits associated with tolerance to whitefly for a range of wild tomato samples conserved at the World Vegetable Center. Based on this information, the breeder can identify the most appropriate two or three accessions to include in her breeding program.
The challenges of using data
Genebanks and researchers generate large volumes of data when they multiply seed, regenerate samples and screen plants for different characteristics. This can be vitally important to other researchers working on the same crops. But the data often isn’t disseminated widely and thus isn’t available to breeders who are searching for promising genetic resources.
The results of characterization and evaluation work are all too often kept only as hard copies, or as files on someone’s hard drive. Even when such data are shared, they can be of limited use: institutions sometimes measure the same trait differently, and the exact method is not documented well enough for the results to be interpreted elsewhere.
The Genesys Catalog difference
The Genesys Catalog focuses squarely on these challenges. It allows researchers and institutions to document and maintain their own versions of the methodologies used for measuring each plant trait. The Genesys Catalog asks data providers to give a short and easy-to-understand title to each of their datasets, a description of the content, the locality where the work was performed, the list of researchers and technicians involved in the creation of the dataset, and the precise descriptors that were used for the germplasm characterization or evaluation.
When such “metadata” – or data about data – are recorded, it is easier for users to discover the underlying characterization and evaluation datasets they need. Publishing the metadata alongside the actual datasets also helps preserve attribution, which is a factor of great concern to Genesys and its partners.
Genesys and the Genesys Catalog use Digital Object Identifiers (DOI) for genebank samples. The Secretariat of the International Treaty on Plant Genetic Resources for Food and Agriculture (ITPGRFA) provides DOIs as permanent unique identifiers for genetic resources at the sample level. The use of DOIs allows the phenotypic data registered in the Catalog to be linked with the physical materials conserved in genebanks.
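As a rough illustration only, the metadata fields and the DOI-based link described above could be sketched as follows in Python. This is not the Genesys data model or API: the class name, function names, field names and the placeholder DOIs are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DatasetMetadata:
    """Hypothetical record holding the metadata items a data provider is asked for."""
    title: str              # short, easy-to-understand title
    description: str        # what the dataset contains
    locality: str           # where the work was performed
    creators: List[str]     # researchers and technicians involved
    descriptors: List[str]  # traits measured, with their documented methods

    def missing_fields(self) -> List[str]:
        """Return the names of metadata fields that were left empty."""
        return [name for name, value in vars(self).items() if not value]


def link_by_doi(observations: List[dict], accessions: Dict[str, str]) -> List[dict]:
    """Attach each observation to the genebank sample that shares its DOI.

    Observations whose DOI is not found among the accessions are skipped.
    """
    linked = []
    for obs in observations:
        accession = accessions.get(obs["doi"])
        if accession is not None:
            linked.append({**obs, "accession": accession})
    return linked


# Placeholder values, invented for illustration; these are not real DOIs or data.
accessions = {"10.0000/EXAMPLE-1": "Wild tomato sample A"}
observations = [
    {"doi": "10.0000/EXAMPLE-1", "trait": "example trait", "value": 3},
    {"doi": "10.0000/EXAMPLE-9", "trait": "example trait", "value": 5},
]
linked = link_by_doi(observations, accessions)
```

Because the DOI identifies the physical sample, a trait value found this way points directly at the material a user can then request from the genebank.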
Connecting to users
To kick off the Genesys Catalog, the Crop Trust team worked closely with staff from six genebanks to prepare, annotate and publish pilot characterization and evaluation datasets. The partners were: the World Vegetable Center in Taiwan, the Tropical Agricultural Research and Higher Education Center (CATIE) in Costa Rica, the Genetic Resources Research Institute (GeRRI) in Kenya, the Malaysian Agricultural Research and Development Institute (MARDI), the National Plant Genetic Resources Laboratory (NPGRL) in the Philippines and the National Genebank of Tunisia.
The Genesys team assisted these partners in adopting existing standards for data management and exchange. The team discussed procedures that would help partners maintain data integrity over time and thus prevent loss of data. The partners then developed, for the first time, internal standard operating procedures for preparing characterization and evaluation datasets for publication.
While preparing the datasets, the partners were also obliged to document such metadata as the creators involved in the making of the dataset and the location where the samples were sown and then measured. They also recorded data relevant to the interpretation and reuse of their datasets.
“If we can’t provide sufficient information, it will be difficult for users to gain access to the germplasm they seek,” said William Solano of CATIE, one of the partners who published pilot datasets. “The Genesys Catalog allows us to present data on what we have conserved in our genebank and share it with the international community. Both Genesys and our institutional database store the passport data. But we would like to add more information that could be of interest to farmers and plant breeders.”
A Genesys Catalog Workshop was held from 5-7 March 2018 at the Crop Trust offices in Bonn, Germany. The six partners met with the Genesys team to discuss their progress. “This workshop proved to be extremely valuable for all of our partners to share their own experiences and also to get some hands-on training in uploading their data to Genesys,” said Matija Obreza, the Crop Trust’s Information Systems Manager.
“The workshop helped us to organize ourselves internally better and create procedures that allow us to capture data with optimal quality,” said Yassine Nahdi of the National Genebank of Tunisia. “Through our involvement in this project, we have learned what others have in their collections, what procedures they follow and, most importantly, to take steps to make these materials available to plant breeders, researchers, and farmers everywhere in the world.”
To date, 79 datasets have been published on Genesys from the six partners involved in the pilot project. “The lessons learned by the six pilot partners have paved the way for other genebanks to add their characterization and evaluation datasets to Genesys,” said Castañeda-Álvarez. “We are starting to extend our support to even CGIAR genebanks and others in publishing their datasets in the Genesys Catalog.”
The Genesys Catalog is a big step forward toward allowing users of plant genetic resources to efficiently find the material they need. As more genebanks publish datasets in the Catalog, it will prove to be an integral component in the global quest to better use our crop genetic resources.
The “Genesys Catalog for Phenotypic Datasets” project is supported thanks to the contribution of the Federal Republic of Germany.