2021-10-19 Meeting notes

Date

Oct 19, 2021

Participants

  • @Evan Rees

  • @Francisco Agosto

  • @Sebastian Raubach

  • @guilhem.sempere

  • @Dave Matthews

  • @Elizabeth Jones

  • @Moira Sheehan

  • @Yaw Nti-Addae

Goals

  • Review progress

  • Figure out how to handle polyploids

  • Figure out how to handle indels

  • Discuss cotton dataset

  • Sample extraction

Discussion topics

Item

Notes

Importing datasets

  • Distinguish between “unsupported” and “failed load”

  • Imports and exports should be replicated 3x
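
A minimal sketch of how the two points above could fit together in a benchmark harness. Everything named here (`UnsupportedDataset`, `benchmark`, the `load` callable) is hypothetical and not taken from any of the systems under comparison; it only illustrates timing three replicates while keeping "unsupported" separate from "failed load".

```python
import time
from statistics import mean


class UnsupportedDataset(Exception):
    """Hypothetical signal that a system cannot represent this dataset at all."""


def benchmark(load, replicates=3):
    """Run `load` `replicates` times and record wall-clock timings.

    Returns a dict whose "status" is "ok", "unsupported", or "failed load".
    """
    timings = []
    for _ in range(replicates):
        start = time.perf_counter()
        try:
            load()
        except UnsupportedDataset:
            # The system declares it cannot handle the dataset: not a failure.
            return {"status": "unsupported", "timings": []}
        except Exception as exc:
            # The system attempted the load and it broke part-way.
            return {"status": "failed load", "timings": timings, "error": str(exc)}
        timings.append(time.perf_counter() - start)
    return {"status": "ok", "timings": timings, "mean": mean(timings)}
```

Reporting the individual timings alongside the mean keeps any variation between the three replicates visible.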

Potato dataset / polyploid

  • Should we keep this?

  • So small, may not be useful

  • Should be kept - autopolyploid

  • Alternate allele encoding

    • Tricky with Hapmap

      • need ad-hoc format to describe INDELs / polyploids

    • Standardized in VCF

  • Systems don’t necessarily retain the data needed to encode a VCF

    • metadata / header not kept

  • MontyDB flattens polyploids in Hapmap output

    • flattened

      • chr_1 100 A/A

      • chr_1 100 A/C

      • chr_1 100 A/G

    • single line

      • chr_1 100 A/A/C/G
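
The two output shapes above can be sketched as follows. The function names are hypothetical, and the pairing rule in `flattened` (first allele paired with each remaining allele) is only one reading that reproduces the example in these notes; it is not confirmed as how any system actually flattens calls.

```python
def single_line(chrom, pos, alleles, sep="/"):
    """Write all alleles of a polyploid call on one line."""
    return f"{chrom} {pos} {sep.join(alleles)}"


def flattened(chrom, pos, alleles, sep="/"):
    """Write one diploid-style line per pair.

    Assumption: the first allele is paired with each remaining allele,
    which matches the A/A, A/C, A/G example in the notes.
    """
    first, rest = alleles[0], alleles[1:]
    return [f"{chrom} {pos} {first}{sep}{other}" for other in rest]


call = ["A", "A", "C", "G"]
print(single_line("chr_1", 100, call))
for line in flattened("chr_1", 100, call):
    print(line)
```

The single-line form keeps the call at one position together, whereas the flattened form repeats the position across several lines.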

INDELs

  • Can INDELs be recoded as ‘i’ and ‘d’?

  • Standard format +/-

  • Hapmap doesn’t specify INDEL usage

  • Problem

    • No standard separator is defined for Hapmap

  • Solution

    • Use / as the separator for Hapmap output

    • Sub-problem

      • How does a system resolve the ambiguity when a call has no separator?

      • e.g. should TTTT be read as TT/TT or as T/T/T/T?
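
A small sketch of why the separator matters. It assumes, purely for illustration, that all alleles in a call have the same length (real INDEL alleles need not), and the function names are hypothetical. Without a separator, TTTT has several valid readings; with / the reading is fixed.

```python
def equal_length_splits(call):
    """List every way to cut an unseparated call into equal-length alleles."""
    n = len(call)
    results = []
    for ploidy in range(1, n + 1):
        if n % ploidy == 0:
            size = n // ploidy
            results.append([call[i:i + size] for i in range(0, n, size)])
    return results


def parse_call(call, sep="/"):
    """Parse a call that uses an explicit separator: exactly one reading."""
    return call.split(sep)


print(equal_length_splits("TTTT"))
# [['TTTT'], ['TT', 'TT'], ['T', 'T', 'T', 'T']]
print(parse_call("TT/TT"))
# ['TT', 'TT']
print(parse_call("T/T/T/T"))
# ['T', 'T', 'T', 'T']
```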

Use cases

  • Simplify use cases

    • Refine and flesh out

  • Focus on compare / contrast

    • Capture differences

  • We DON’T want to create development work for anyone

Action items

@guilhem.sempere Investigate how Gigwa resolves ambiguity in Hapmap format
@Evan Rees Open a discussion on Slack regarding timelines
@Dave Matthews Fill out benchmarks for lettuce_chr1, lettuce_full, rice_full
@Dave Matthews Load potato and report any errors / irregularities
@guilhem.sempere Fill out benchmarks for lettuce_chr1, lettuce_full, rice_full
@guilhem.sempere Load potato and report any errors / irregularities
@Sebastian Raubach Fill out benchmarks for lettuce_chr1, lettuce_full, rice_full
@Sebastian Raubach Load potato and report any errors / irregularities

Decisions

  1. Use / as separator for Hapmap format (both import and export)