Meeting minutes: Polyploid Genomics Data Storage and Analysis, CIP, Lima, Peru, 8th - 10th May 2019

Date

May 8, 2019

Participants

  • @Elizabeth Jones

  • @Hannele Lindqvist-Kreuze

  • @Dorcus Gemenet

  • @Trushar Shah

  • @Catherine Breton

  • @Luis Augusto Becerra Lopez-Lavalle

  • @Ranjana Bhattacharjee

  • @Paterne AGRE

  • @Lindsay Clark

  • @endelman

  • @Marcelo Mollinari

  • @Tim Millar

  • @Moira Sheehan

  • @Antonio Augusto Franco Garcia

  • @hugo

  • @Umesh Rosyara

  • John Schoper, Leader; Genetics, Genomics and Crop Improvement Sciences Division

  • Hugo Campos, Director of Research

  • Noel Anglin, Leader for Conserving Biodiversity for the Future

  • Michael Friedman, Science Officer, RTB Program

  • Bert de Boeck, Statistician

  • Elisa Salas, Data Manager/Potato breeder

  • Ciro Rosales, System Analyst

  • Federico Diaz, Sweetpotato breeder

  • Wolfgang Gruneberg, Sweetpotato breeder

Goals

Determine needs for CGIAR polyploid crops

  • Determine priorities

  • Define action items

 

Presentations

https://cgiar-my.sharepoint.com/:f:/g/personal/a_collazos_cgiar_org/EtQCU5kryvtNo255SDg5yXYB5GwOmsmmoaBNMZgoIDfaMA?e=vjxVsL

Discussion topics

Genotyping data storage needs

Genotyping needs for vendors/marker delivery

Tools and analyses needs and demos

Bioinformatics resource and skill needs

Community development

Need

Solution

Questions

Responsible

Timing

Genotyping data storage needs

 

 

 

 

Potato - would like to be able to load genotyping data automatically

GOBii will load data directly from DArT and Intertek by the end of the year, but a solution still needs to be developed for polyploids unless diploidized data are used.

Which crops are able to use diploidized data for now? Do crops know the standard data format they will use for genotyping?

Liz, Hannele, Elisa

Liz follow up with each crop on standard formats.

Liz work with GOBii on formats for tetraploids and diploids, or posterior means.

If extra funding is obtained, deploy within 1 year

Consolidated data (sweet potato) – data are currently in various places and formats

A GOBii data management system would help to consolidate and share data. Need to manage formats for sweet potato

As above

Liz, Dorcus, Elisa

As above

Organization of vcf, sequence files etc

GOBii has input data links that can help gather files in one place. Multiple versions of genotyping files can be stored under one project

 

Liz

Determine data storage needs by Q3 2019. Develop within 1 year

Sweet potato – possibly at limits of Breedbase/sweet potato base genotyping storage

GOBii!

 

Liz, Dorcus, Elisa

Within 1 year if additional funding is obtained

Marker development and scoring needs

 

 

 

 

Sweet potato (and others): need high-depth sequencing to call polyploid genotypes (e.g. 100X for hexaploids) and to maximize genomic predictions.

Eng to request costs from DArT and evaluate higher density sequencing. Can test with a ploidy series of different crops.

Do we need high depth sequencing for haplotype methods or posterior mean genotypes? Or if we include linkage analysis?

Eng

Q3 2019

All crops – need DArT to provide vcf rather than the 2-row format. Having a vcf file enables crop-specific analysis based on ploidy levels or the need for haplotypes

Eng talk to DArT about different file format options

So many possible vcf formats – can we agree as a group?

Eng follow up with DArT on ability to provide vcf

Within 2 months

Bananas: need Illumina GenTrain scoring to support triploids

?

Module 5 does not currently support Illumina. Is there a contact the banana group can follow up with?

?

 

Tools and analyses for polyploids

 

 

 

 

Vcf tools to be adapted for triploids (banana)

Marcelo may have solutions – Trushar to follow up

 

Trushar, Catherine and Marcelo

?

Yam needs to know ploidy levels of lines; flow cytometry data do not match karyotypes

SuperMASSA may be able to help with understanding ploidy levels. Ranjana will follow up with Marcelo

 

Ranjana and Marcelo

?

Vcf filtering tools needed for polyploids

Lindsay demonstrated an R tool, vcf_filter, that can be used for filtering polyploid vcf files. The tool can be shared with the community

 

Lindsay share with community

Q3 2019

Software for LD decay calculation (potato)

?

 

 

 

GWAS with GT probabilities (potato)

?

 

 

 

Upgrades to GWASpoly etc for multi-allelic haplotypes, populations with variant ploidy

Apply for funding -

Jeff will keep us updated on his USDA grant to develop Poly tools

Possibility to apply for a CGIAR grant? Also apply through SC7-B or CtEH funding?

 

Liz and Umesh look at SC7-B or CtEH funding

Q3 2019

File storage, organization and analysis capacity

 

 

 

 

Storage of data and computing resources needed (banana/IITA)

Shared server or services / Amazon cloud computing?

Can this be managed/coordinated through EiB module 5?

 

Liz follow up with Kelly and Abhi

Within 1 month

Bioinformatics support / shared services for CGIAR community

Can this be managed/coordinated through EiB module 5?

 

Liz follow up with Kelly and Abhi

Within 1 month

 

 

 

 

 

Community and knowledge sharing

 

 

 

 

Consolidate available tools with comments on appropriate usage / reviews and example files in a central site

Can this be done on the EiB portal? We can do on the Confluence polyploid site for now.

 

Liz follow up with Kelly on EiB resources for links to tools.

Liz start collating tools on Confluence for now. Everyone else to add info

Q3 2019

Cassava and wheat would benefit from ploidy tools to help with marker design/mapping traits

Bring wheat and Cassava communities into Ploidy working group

 

Liz

Q3 2019

Graphical interfaces, e.g. like HIDAP, to access tools (mainly R tools)

 

Apply for funding to do this?

 

Liz explore with Kelly and Abhi

Q3 2019

CGIAR community would like to understand more about ploidy data formats

Join ploidyverse community

 

Lindsay send out link/invite

Q2 2019

Training at different levels

Next meeting in Brazil?

 

For training/meetings need funding - EiB or another project eg RTB?

 

Can apply for developer exchange program through EiB module 5.

- Visit GOBii/Breedbase group

- Visit Lindsay’s group?

 

Liz explore with Kelly, Abhi and Kate, Dorcus explore with RTB

Developer exchange proposals by 5/17/19!

Q3 2019

A medium to share information digitally

Have a Slack channel and a public Confluence site for now. Can try to find a shared medium on the EiB portal

Share manuscripts, address problems

 

Liz to engage community through Slack. All to contribute papers and discussion points through Slack and Confluence.

Ongoing
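Several rows above refer to posterior mean genotypes and to the need for high-depth sequencing (e.g. 100X for hexaploids). As a rough illustration only, the sketch below computes a posterior over allele dosage from read counts and shows how depth affects confidence in the call. It assumes a uniform prior over dosages, binomial read sampling and a symmetric sequencing error rate. The function names and the 1% error rate are hypothetical choices made for this example and are not taken from any tool discussed at the meeting.

```python
from math import comb


def dosage_posterior(ref_reads, alt_reads, ploidy, error_rate=0.01):
    """Posterior probability of each alternate-allele dosage 0..ploidy.

    Illustrative assumptions: uniform prior over dosages, binomial
    sampling of reads, symmetric sequencing error.
    """
    depth = ref_reads + alt_reads
    likelihoods = []
    for dosage in range(ploidy + 1):
        frac = dosage / ploidy
        # Chance that one read shows the alternate allele, allowing for error.
        p_alt = frac * (1 - error_rate) + (1 - frac) * error_rate
        likelihoods.append(
            comb(depth, alt_reads) * p_alt**alt_reads * (1 - p_alt) ** ref_reads
        )
    total = sum(likelihoods)
    return [lk / total for lk in likelihoods]


def posterior_mean_dosage(posterior):
    """Expected dosage under the posterior (a continuous genotype value)."""
    return sum(dosage * prob for dosage, prob in enumerate(posterior))


# Same 20% alternate-read fraction in a hexaploid, at 10X and at 100X depth.
low = dosage_posterior(ref_reads=8, alt_reads=2, ploidy=6)
high = dosage_posterior(ref_reads=80, alt_reads=20, ploidy=6)

print("10X  most likely dosage:", low.index(max(low)), "prob:", round(max(low), 3))
print("100X most likely dosage:", high.index(max(high)), "prob:", round(max(high), 3))
print("100X posterior mean dosage:", round(posterior_mean_dosage(high), 3))
```

In this toy case both depths favour the same dosage class, but the posterior is much more concentrated at the higher depth. This is the intuition behind asking vendors about deeper sequencing for hexaploids. Real genotype callers use more elaborate priors and error models.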

Short write up for media

Most genomics tools and data storage methods are developed for diploid crops, but some of the most important crops in the world are polyploids such as potato, sweet potato, wheat and banana. The genomics of polyploids is complex, but new tools and methods are helping to better understand these complex genomes, as well as the genomes of crops that mostly behave like diploids but whose genomes show an ancient history of duplication, such as wheat and cassava. Recently the Excellence in Breeding Program, module 5, funded a meeting that evaluated genomics resource needs for polyploid crops at CGIAR, and demonstrated new tools and methods in polyploid genomics. The meeting brought together experts from CGIAR and the public sector and generated exceedingly fruitful discussions, information exchanges, and plans for future collaborations. The participants resolved to share available polyploid tools in a common location, find solutions for accessing bioinformatics, server and cloud resources, meet with vendors to discuss polyploid genotyping needs, and find resources to improve polyploid tools. Most importantly, the participants all agreed on the importance of contributing towards the EiB polyploid community of practice, exchanging knowledge relating to the important topic of polyploid crops.

 

The meeting was led by Elizabeth Jones, director of GOBii (Genomic Open-source Breeding informatics initiative) and Adjunct Professor in Plant Breeding and Genetics, and hosted by Dorcus Gemenet and Hugo Campos, Director of Research, at the International Potato Center (CIP), Lima, Peru. GOBii is funded by the Bill and Melinda Gates Foundation and resourced through The Institute of Biotechnology and The Boyce Thompson Institute at Cornell University.

 

 

 

Action items

@Elizabeth Jones to follow up on data storage formats for each crop and design solutions, within 1 year

@eng to follow up with vendors on sequencing levels required for polyploids, associated costs, and vendor formats for delivering data

@Ranjana Bhattacharjee to follow up with Marcelo on using SuperMASSA to predict ploidy in yam

@Trushar Shah to follow up with Marcelo on using tools for triploid banana

@Lindsay Clark to include polyploid group in ploidyverse community

@endelman to keep us updated on USDA proposal to update poly tools

@Umesh Rosyara to look at CGIAR funding that could be used to upgrade tools

@Elizabeth Jones to follow up with Kelly and Abhi on bioinformatics and server/cloud resources

@Elizabeth Jones to follow up with Kelly on common storage location for polyploid tools and possibly funding to develop interfaces eg like HIDAP

@Elizabeth Jones to engage community through Slack

@Elizabeth Jones to add RTB/Next-gen community to Slack and Confluence

@Elizabeth Jones to start collating tools in Confluence for now

@all to reach out to contacts to find resources for next meeting (RTB? In Brazil?)

 

Decisions