Meeting minutes: Polyploid Genomics Data Storage and Analysis, CIP, Lima, Peru, 8th - 10th May 2019
Date
May 8, 2019
Participants
@Elizabeth Jones
@Hannele Lindqvist-Kreuze
@Dorcus Gemenet
@Trushar Shah
@Catherine Breton
@Luis Augusto Becerra Lopez-Lavalle
@Ranjana Bhattacharjee
@Paterne AGRE
@Lindsay Clark
@endelman
@Marcelo Mollinari
@Tim Millar
@Moira Sheehan
@Antonio Augusto Franco Garcia
@hugo
@Umesh Rosyara
John Schoper, Leader; Genetics, Genomics and Crop Improvement Sciences Division
Hugo Campos, Director of Research
Noel Anglin, Leader for Conserving Biodiversity for the Future
Michael Freidman, Science Officer, RTB Program
Bert de Boeck, Statistician
Elisa Salas, Data Manager/Potato breeder
Ciro Rosales, System Analyst
Federico Diaz, Sweetpotato breeder
Wolfgang Gruneberg, Sweetpotato breeder
Goals
Determine needs for CGIAR polyploid crops
Determine priorities
Define action items
Presentations
Discussion topics
Genotyping data storage needs
Genotyping needs for vendors/marker delivery
Tools and analyses needs and demos
Bioinformatics resource and skill needs
Community development
Need | Solution | Questions | Responsible | Timing |
Genotyping data storage needs | | | | |
Potato - would like to be able to load genotyping data automatically | GOBii will load data directly from DArT and Intertek by end of year, but a solution needs to be developed for polyploids unless diploidized data are used. | Which crops are able to use diploidized data for now? Do crops know the standard data format they will use for genotyping? | Liz, Hannele, Elisa | Liz to follow up with each crop on standard formats. Liz to work with GOBii on formats for tetraploids and diploids, or posterior means. If extra funding is obtained, deploy within 1 year |
Consolidated data (sweet potato) - data are now in various places and formats | A GOBii data management system would help to consolidate and share data. Need to manage formats for sweet potato | As above | Liz, Dorcus, Elisa | As above |
Organization of vcf, sequence files etc. | GOBii has input data links that can help gather files in one place. Multiple versions of genotyping files can be stored under one project | | Liz | Determine data storage needs by Q3 2019. Develop within 1 year |
Sweet potato - possibly at limits of Breedbase/sweet potato base genotyping storage | GOBii! | | Liz, Dorcus, Elisa | Within 1 year if additional funding is obtained |
Marker development and scoring needs | | | | |
Sweet potato (and others) - need high depth sequencing to call polyploid genotypes, e.g. 100X for hexaploids, and to maximize genomic predictions. | Eng to request costs from DArT and evaluate higher density sequencing. Can test with ploidy series of different crops. | Do we need high depth sequencing for haplotype methods or posterior mean genotypes? Or if we include linkage analysis? | Eng | Q3 2019 |
All crops - need DArT to provide vcf rather than 2 row format. Having a vcf file enables specific crop analysis based on ploidy levels or need for haplotypes | Eng to talk to DArT about different file format options | So many possible vcf formats - can we agree as a group? | Eng to follow up with DArT on ability to provide vcf | Within 2 months |
Bananas - need Illumina GenTrain scoring to support triploids | ? | Module 5 is not currently supporting Illumina. Is there a contact the banana group has to follow up with? | ? | |
Tools and analyses for polyploids | | | | |
Vcf tools to be adapted for triploids (banana) | Marcelo may have solutions - Trushar to follow up | | Trushar, Catherine and Marcelo | ? |
Yam - need to know ploidy levels of lines; flow cytometry data are not matching karyotypes | SuperMASSA may be able to help with understanding ploidy levels. Ranjana will follow up with Marcelo | | Ranjana and Marcelo | ? |
Vcf filtering tools needed for polyploids | Lindsay demonstrated an R tool, vcf_filter, that can be used for filtering polyploid vcf files; the tool can be shared in the community | | Lindsay to share with community | Q3 2019 |
Software for LD decay calculation (potato) | ? | | | |
GWAS with GT probabilities (potato) | ? | | | |
Upgrades to GWASpoly etc. for multi-allelic haplotypes and populations with variant ploidy | Apply for funding - Jeff will keep us updated on his USDA grant to develop Poly tools. Possibility to apply for a CGIAR grant? Also apply through SC7-B or CtEH funding? | | Liz and Umesh to look at SC7-B or CtEH funding | Q3 2019 |
File storage, organization and analysis capacity | | | | |
Storage of data and computing resources needed (banana/IITA). Shared server or services / Amazon cloud computing? | Can this be managed/coordinated through EiB module 5? | | Liz to follow up with Kelly and Abhi | Within 1 month |
Bioinformatics support / shared services for CGIAR community | Can this be managed/coordinated through EiB module 5? | | Liz to follow up with Kelly and Abhi | Within 1 month |
Community and knowledge sharing | | | | |
Consolidate available tools with comments on appropriate usage / reviews and example files in a central site | Can this be done on the EiB portal? We can do this on the Confluence polyploid site for now. | | Liz to follow up with Kelly on EiB resources for links to tools. Liz to start collating tools on Confluence for now. Everyone else to add info | Q3 2019 |
Cassava and wheat would benefit from ploidy tools to help with marker design/mapping traits | Bring wheat and cassava communities into Ploidy working group | | Liz | Q3 2019 |
Graphical interfaces, e.g. like HIDAP, to access tools (mainly R tools) | Apply for funding to do this? | | Liz to explore with Kelly and Abhi | Q3 2019 |
CGIAR community would like to understand more about ploidy data formats | Join ploidyverse community | | Lindsay to send out link/invite | Q2 2019 |
Training at different levels | Next meeting in Brazil? For training/meetings need funding - EiB or another project, e.g. RTB? Can apply for developer exchange program through EiB module 5: visit GOBii/Breedbase group, visit Lindsay’s group? | | Liz to explore with Kelly, Abhi and Kate; Dorcus to explore with RTB. Developer exchange proposals by 5/17/19! | Q3 2019 |
A medium to share information digitally | Have Slack channel and public Confluence site for now. Can try to find a shared medium on the EiB portal. Share manuscripts, address problems. | | Liz to engage community through Slack. All to contribute papers and discussion points through Slack and Confluence | Ongoing |
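The data format options raised in the table (diploidized data, tetraploid dosage calls, posterior means) can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions: markers are biallelic, genotypes arrive as VCF-style GT strings such as 0/0/1/1, and the function names are hypothetical rather than part of GOBii, DArT or any other tool mentioned at the meeting.

```python
from typing import Optional, Sequence


def dosage_from_gt(gt: str) -> Optional[int]:
    """Count non-reference alleles in a VCF-style GT string, e.g. '0/0/1/1'.

    Assumption: biallelic marker, so any allele other than '0' counts as
    the alternate allele. Returns None when any allele is missing ('.').
    """
    alleles = gt.replace("|", "/").split("/")
    if any(a == "." for a in alleles):
        return None
    return sum(1 for a in alleles if a != "0")


def diploidize(dosage: int, ploidy: int) -> int:
    """Collapse a polyploid dosage into three diploid-style classes.

    0 = homozygous reference, 2 = homozygous alternate, and every
    intermediate dosage (1 .. ploidy-1) becomes a single heterozygous class 1.
    """
    if dosage == 0:
        return 0
    if dosage == ploidy:
        return 2
    return 1


def posterior_mean(probs: Sequence[float]) -> float:
    """Expected dosage given P(dosage = k) for k = 0 .. ploidy."""
    return sum(k * p for k, p in enumerate(probs))


# Hypothetical tetraploid example (ploidy = 4).
d = dosage_from_gt("0/0/1/1")
print(d)                 # duplex call: 2 alternate alleles
print(diploidize(d, 4))  # stored as heterozygous (1) once diploidized
print(posterior_mean([0.0, 0.1, 0.8, 0.1, 0.0]))  # close to 2.0
```

The diploidized form loses the distinction between simplex, duplex and triplex genotypes, which is why the choice between diploidized calls, full dosage and posterior means matters for storage.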
Short write up for media
Most genomics tools and data storage methods are developed for diploid crops, but some of the most important crops in the world are polyploids, such as potato, sweet potato, wheat and banana. The genomics of polyploids are complex, but new tools and methods are helping to better understand these complex genomes, as well as the genomes of crops that mostly behave like diploids but whose genomes show an ancient history of duplication, such as wheat and cassava. Recently, Module 5 of the Excellence in Breeding (EiB) Program funded a meeting that evaluated genomics resource needs for polyploid crops at CGIAR and demonstrated new tools and methods in polyploid genomics. The meeting brought together experts from CGIAR and the public sector and generated exceedingly fruitful discussions, information exchanges, and plans for future collaborations. The participants resolved to share available polyploid tools in a common location, find solutions for accessing bioinformatics, server and cloud resources, meet with vendors to discuss polyploid genotyping needs, and find resources to improve polyploid tools. Most importantly, the participants all agreed on the importance of contributing towards the EiB polyploid community of practice and exchanging knowledge on polyploid crops.
The meeting was led by Elizabeth Jones, director of GOBii (Genomic Open-source Breeding informatics initiative) and Adjunct Professor in Plant Breeding and Genetics, and hosted by Dorcus Gemenet and Hugo Campos, Director of Research, at the International Potato Center (CIP), Lima, Peru. GOBii is funded by the Bill and Melinda Gates Foundation and resourced through The Institute of Biotechnology and The Boyce Thompson Institute at Cornell University.
Action items
@Elizabeth Jones to follow up on data storage formats for each crop and design solutions, within 1 year
@eng to follow up with vendors on sequencing levels required for polyploids, associated costs, and vendor formats for delivering data
@Ranjana Bhattacharjee to follow up with Marcelo on using SuperMASSA to predict ploidy in yam
@Trushar Shah to follow up with Marcelo on using tools for triploid banana
@Lindsay Clark to include polyploid group in ploidyverse community
@endelman to keep us updated on USDA proposal to update poly tools
@Umesh Rosyara to look at CGIAR funding that could be used to upgrade tools
@Elizabeth Jones to follow up with Kelly and Abhi on bioinformatics and server/cloud resources
@Elizabeth Jones to follow up with Kelly on common storage location for polyploid tools and possibly funding to develop interfaces eg like HIDAP
@Elizabeth Jones to engage community through slack
@Elizabeth Jones to add RTB/Next-gen community to Slack and Confluence
@Elizabeth Jones to start collating tools in Confluence for now
@all to reach out to contacts to find resources for the next meeting (through RTB? in Brazil?)