
Genotyping data management systems

System

Group/Institution

Contact

VM Hostname

phase

Germinate

JHI

Sebastian

MontyDB

Cornell

Francisco

GDM

Cornell

Joel

Raubach

cbsugobiizvm21.biohpc.cornell.edu

Phase: one

GDM

Cornell

GOBii

Dave Matthews

Evan Rees

cbsugobiizvm23.biohpc.cornell.edu

Phase: one

Gigwa

CIRAD

guilhem.sempere

cbsugobiizvm19.biohpc.cornell.edu

Phase: one

Breedbase

BTI

Tetima

Gigwa

Titima Tantikanjana

cbsugobiizvm20.biohpc.cornell.edu

Phase: TWO

MontyDB

Cornell

McCouch Lab

Francisco Agosto

Phase: TWO

BCFTools

Broad Institute

Phase: TWO

PHG

Cornell

Buckler Lab

Ask Ed

BCF

Broad Institute

Phase: HOLD

Breeding Insight

Breeding Insight

Moira Sheehan

Phase: HOLD

GDR-BIMS

Washington State University

Dori?

Phase: HOLD

IPK

Patrick

Phase: HOLD

VM allocations

VM Hostname

Status

Server Pool

Assignment

username

cbsugobiizvm20.biohpc.cornell.edu

Status: off

cbsugobii09

Breedbase

Germinate

breedbase

cbsugobiizvm23.biohpc.cornell.edu

Status: on

cbsugobii09

GDM

gadm

cbsugobiizvm19.biohpc.cornell.edu

Status: on

cbsugobii10

Gigwa

gigwa

cbsugobiizvm22.biohpc.cornell.edu

Status: off

cbsugobii10

PHG

MontyDB

phg

cbsugobiizvm21.biohpc.cornell.edu

Status: on

cbsugobii11

Breedbase

Germinate

jhi

Status: off

cbsugobii11

PHG

Username

User

gadm

system

yaw

dave

francisco

MontyDB

Users

montydb

Each VM has the following resources:

  • 8 CPUs

  • 64 GB RAM

  • 2 TB SSD

  • /storage mounted volume

  • /shared_data mounted volume
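A minimal sketch of how a VM could be checked against the resource list above. The function names, the dictionary layout, and the 10% RAM tolerance are assumptions made for illustration; the 2 TB SSD is not checked because the page does not say where it is mounted.

```python
import os

# Expected per-VM resources, taken from the list above.
EXPECTED = {
    "cpus": 8,
    "ram_gb": 64,
    "mounts": ("/storage", "/shared_data"),
}


def observe_resources(mounts=EXPECTED["mounts"]):
    """Collect CPU count, physical RAM, and mount status (Linux only)."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    phys_pages = os.sysconf("SC_PHYS_PAGES")
    return {
        "cpus": os.cpu_count(),
        "ram_gb": page_size * phys_pages / 1024**3,
        "mounted": {m: os.path.ismount(m) for m in mounts},
    }


def find_shortfalls(observed, expected=EXPECTED, ram_tolerance=0.9):
    """Return a list of readable problems; an empty list means the VM matches."""
    problems = []
    if (observed["cpus"] or 0) < expected["cpus"]:
        problems.append(f"cpus: {observed['cpus']} < {expected['cpus']}")
    # The OS reports slightly less than the nominal RAM, so allow some slack
    # (ram_tolerance is an assumed value, not something stated on this page).
    if observed["ram_gb"] < expected["ram_gb"] * ram_tolerance:
        problems.append(f"ram_gb: {observed['ram_gb']:.1f} < {expected['ram_gb']}")
    for mount in expected["mounts"]:
        if not observed["mounted"].get(mount, False):
            problems.append(f"{mount} is not mounted")
    return problems
```

On a VM, `find_shortfalls(observe_resources())` should return an empty list if the machine matches the list above.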

Datasets

Dataset

Format

Location

Maize NAM

CSV

/shared_data/test_data/NAM_HM32/csv

Simulated datasets

polyploid data in VCF

Moira to share a dataset; invite her to the next meeting

indel data

rice high density array

vcf

The Rice High Density Array is 700K SNPs x ~1,500 samples

SNPs only

vcf too

Francisco has already loaded it into his own Gigwa instance without problems

http://rs-bt-mccouch4.biotech.cornell.edu/staged_data/CSHL_EVA_Release_HDRA.tar

Hapmap: cbsugobiizvm19:/shared_data/test_data/genomics-systems-comparison/rice/Dataset.hmp.txt
Flapjack: cbsugobiizvm19:/shared_data/test_data/genomics-systems-comparison/rice/flapjack/Dataset.*
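Before loading the HapMap export into a system, it can help to confirm that its dimensions match the expected 700K SNPs x ~1,500 samples. The sketch below assumes the file is a standard tab-delimited HapMap file with 11 fixed metadata columns ahead of the sample columns; the function names are hypothetical.

```python
# Number of fixed metadata columns that precede the sample columns in HapMap
# (rs#, alleles, chrom, pos, strand, assembly#, center, protLSID, assayLSID,
# panelLSID, QCcode).
HAPMAP_FIXED_COLUMNS = 11


def hapmap_sample_names(header_line):
    """Return the sample names from a tab-delimited HapMap header line."""
    fields = header_line.rstrip("\r\n").split("\t")
    if len(fields) < HAPMAP_FIXED_COLUMNS:
        raise ValueError("header has fewer than 11 columns; not HapMap?")
    return fields[HAPMAP_FIXED_COLUMNS:]


def count_hapmap(lines):
    """Return (number of samples, number of marker rows) for HapMap lines.

    `lines` is any iterable of text lines, for example an open file handle.
    """
    iterator = iter(lines)
    samples = hapmap_sample_names(next(iterator))
    markers = sum(1 for line in iterator if line.strip())
    return len(samples), markers
```

Typical use would be `with open(path) as handle: print(count_hapmap(handle))`, with `path` pointing at `Dataset.hmp.txt`.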

African rice

https://gigwa.ird.fr/gigwa/?module=AfricanRice

available as vcf

metadata availability?

3,000 rice genomes

too large? 29M SNPs

lettuce Wageningen

Public dataset

vcf

12M markers x 500 accessions

3 vcfs - one SNPs, one indels, one structural variants

40 GB

https://www.nature.com/articles/s41588-021-00831-0
ftp.cngb.org/pub/CNSA/data2/CNP0000335/Other/variation
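To compare the scale of the candidate datasets, the marker and sample counts quoted above can be multiplied out into genotype-call counts. This is a rough sketch: the counts are the rounded figures from this page, the 3,000-sample figure is taken from the dataset name, and `bytes_per_call=1` is an assumption about an uncompressed dense matrix, not a property of any of the systems listed.

```python
# (markers, samples) as quoted on this page; all figures are rounded.
DATASETS = {
    "rice HDRA": (700_000, 1_500),
    "lettuce (Wageningen)": (12_000_000, 500),
    # Sample count inferred from the dataset name "3,000 rice genomes".
    "3,000 rice genomes": (29_000_000, 3_000),
}


def genotype_calls(markers, samples):
    """Total number of genotype calls in a markers x samples matrix."""
    return markers * samples


def raw_size_gb(markers, samples, bytes_per_call=1):
    """Size in GB of a dense matrix; bytes_per_call is an assumed encoding."""
    return genotype_calls(markers, samples) * bytes_per_call / 1e9


if __name__ == "__main__":
    for name, (markers, samples) in DATASETS.items():
        calls = genotype_calls(markers, samples)
        print(f"{name}: {calls:,} calls, ~{raw_size_gb(markers, samples):.2f} GB")
```

Under these assumptions the lettuce set has roughly six times as many calls as the rice HDRA, and the 3,000 rice genomes set is more than an order of magnitude larger again, which is consistent with the "too large?" note.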

Lettuce

hapmap

flapjack

/shared_data/test_data/genomics-systems-comparison/lettuce/
  chr1/
    Lactuca__project1__2021-06-24__1152198variants__FLAPJACK.fjzip
    Lactuca__project1__2021-06-24__1152198variants__HAPMAP.zip
    markerlists.zip
  full/
    Lactuca__project1__2021-06-28__12983735variants__FLAPJACK.fjzip
    Lactuca__project1__2021-06-28__12983735variants__HAPMAP.zip
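The export file names above appear to follow a `module__project__date__Nvariants__FORMAT.ext` pattern. That pattern is inferred from these four names only, so the sketch below is an assumption rather than a documented naming scheme; it pulls out the variant count and format so a script can pick the right file.

```python
import re
from datetime import date

# Pattern inferred from the file names listed above (an assumption).
EXPORT_NAME = re.compile(
    r"^(?P<module>.+?)__(?P<project>.+?)__(?P<date>\d{4}-\d{2}-\d{2})"
    r"__(?P<variants>\d+)variants__(?P<format>[A-Z]+)\.(?P<ext>\w+)$"
)


def parse_export_name(filename):
    """Split an export file name into its parts, or return None if it
    does not match the inferred pattern (e.g. markerlists.zip)."""
    match = EXPORT_NAME.match(filename)
    if match is None:
        return None
    info = match.groupdict()
    info["date"] = date.fromisoformat(info["date"])
    info["variants"] = int(info["variants"])
    return info
```

For example, `parse_export_name("Lactuca__project1__2021-06-28__12983735variants__HAPMAP.zip")["variants"]` gives the variant count of the full export as an integer.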

potato (polyploid)

VCF

/shared_data/test_data/genomics-systems-comparison/potato/
  PRJNA414303.CHR5.filterNullGT.vcf.gz
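Since the potato file is listed as the polyploid test case, a quick check of the ploidy actually encoded in its genotypes may be useful before loading. The sketch below counts alleles in the GT sub-field of each sample column, relying on the VCF rule that GT comes first in a sample field when present. The function names and the `max_records` sampling limit are illustrative assumptions.

```python
import gzip
import re
from collections import Counter

# GT alleles are separated by "/" (unphased) or "|" (phased).
_ALLELE_SEPARATOR = re.compile(r"[/|]")


def gt_ploidy(sample_field):
    """Ploidy implied by one sample column, e.g. '0/0/1/1:12' gives 4."""
    genotype = sample_field.split(":", 1)[0]
    return len(_ALLELE_SEPARATOR.split(genotype))


def ploidy_counts(lines, max_records=1000):
    """Tally ploidy over all sample columns of the first max_records records."""
    counts = Counter()
    seen = 0
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta-information and the column header
        fields = line.rstrip("\r\n").split("\t")
        # Columns 0-8 are CHROM..FORMAT; sample columns start at index 9.
        counts.update(gt_ploidy(field) for field in fields[9:])
        seen += 1
        if seen >= max_records:
            break
    return counts


def ploidy_counts_from_file(path, max_records=1000):
    """Same tally, reading a gzip-compressed VCF from disk."""
    with gzip.open(path, "rt") as handle:
        return ploidy_counts(handle, max_records)
```

A result dominated by a single key of 4 would indicate tetraploid calls; mixed keys would suggest the file needs a closer look before it is used in a comparison.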

source