General Information

Genotyping data management systems

| System | Group | Contact | VM Hostname | Phase |
| Germinate | JHI | @Sebastian Raubach | cbsugobiizvm21.biohpc.cornell.edu | one |
| GDM | Cornell, GOBii | @Dave Matthews, @Evan Rees | cbsugobiizvm23.biohpc.cornell.edu | one |
| Gigwa | CIRAD | @guilhem.sempere | cbsugobiizvm19.biohpc.cornell.edu | one |
| Breedbase | BTI | @Titima Tantikanjana | cbsugobiizvm20.biohpc.cornell.edu | two |
| MontyDB | Cornell, McCouch Lab | @Francisco Agosto | | two |
| BCFTools | Broad Institute | | | two |
| PHG | Cornell, Buckler Lab | Ask Ed | | hold |
| Breeding Insight | Breeding Insight | @Moira Sheehan | | hold |
| GDR-BIMS | University of Washington | Dori? | | hold |
| IPK | | Patrick | | hold |

VM allocations

| VM Hostname | Status | Server Pool | Assignment | Username |
| cbsugobiizvm20.biohpc.cornell.edu | off | cbsugobii09 | Breedbase | breedbase |
| cbsugobiizvm23.biohpc.cornell.edu | on | cbsugobii09 | GDM | gadm |
| cbsugobiizvm19.biohpc.cornell.edu | on | cbsugobii10 | Gigwa | gigwa |
| cbsugobiizvm22.biohpc.cornell.edu | off | cbsugobii10 | PHG | phg |
| cbsugobiizvm21.biohpc.cornell.edu | on | cbsugobii11 | Germinate | jhi |
| | off | cbsugobii11 | MontyDB | montydb |

Each VM has the following resources:

  • 8 CPUs

  • 64 GB RAM

  • 2 TB SSD

  • /storage mounted volume

  • /shared_data mounted volume
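The resource list above can be sanity-checked from inside a VM. The following is a minimal sketch using only the Python standard library; the function name `check_vm` and the report layout are illustrative assumptions, not part of any provisioning tooling for these machines. The expected CPU count and the two mount points are taken from the list above.

```python
import os
import shutil

# Expected values copied from the resource list; the check itself is an
# illustrative assumption, not an official provisioning test.
EXPECTED_CPUS = 8
MOUNT_POINTS = ["/storage", "/shared_data"]


def total_ram_gb():
    """Return physical RAM in GB, or None if the platform does not expose it."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1e9
    except (ValueError, OSError, AttributeError):
        return None


def check_vm(expected_cpus=EXPECTED_CPUS, mount_points=MOUNT_POINTS):
    """Collect CPU count, RAM size and mount status for the current machine."""
    cpus = os.cpu_count()
    report = {
        "cpus": cpus,
        "cpus_ok": cpus == expected_cpus,
        "ram_gb": total_ram_gb(),
        "mounts": {},
    }
    for path in mount_points:
        if os.path.ismount(path):
            usage = shutil.disk_usage(path)
            report["mounts"][path] = {
                "mounted": True,
                "total_gb": round(usage.total / 1e9, 1),
                "free_gb": round(usage.free / 1e9, 1),
            }
        else:
            report["mounts"][path] = {"mounted": False}
    return report


if __name__ == "__main__":
    for key, value in check_vm().items():
        print(key, value)
```

Run on one of the VMs, a `False` under `cpus_ok` or a mount reported as not mounted would point to a VM that does not match the list.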

Datasets

Maize NAM
  • Format: CSV
  • Location: /shared_data/test_data/NAM_HM32/csv

Simulated datasets

Polyploid data in VCF
  • Moira to share a dataset; invite her to the next meeting

Indel data

Rice High Density Array
  • Format: VCF
  • 700K SNPs x ~1500 samples
  • SNPs only
  • VCF too
  • Francisco has already loaded it into Gigwa (his own instance) without problems
  • Download: http://rs-bt-mccouch4.biotech.cornell.edu/staged_data/CSHL_EVA_Release_HDRA.tar
  • Hapmap: cbsugobiizvm19:/shared_data/test_data/genomics-systems-comparison/rice/Dataset.hmp.txt
  • Flapjack: cbsugobiizvm19:/shared_data/test_data/genomics-systems-comparison/rice/flapjack/Dataset.*

African rice
  • Location: https://gigwa.ird.fr/gigwa/?module=AfricanRice
  • Available as VCF
  • Metadata availability?

3,000 rice genomes
  • Too large? 29M SNPs

Lettuce, Wageningen (public dataset)
  • Format: VCF
  • 12M markers x 500 accessions
  • 3 VCFs: one for SNPs, one for indels, one for structural variants
  • 40 GB
  • Paper: https://www.nature.com/articles/s41588-021-00831-0
  • Data: ftp.cngb.org/pub/CNSA/data2/CNP0000335/Other/variation

Lettuce
  • Format: Hapmap, Flapjack
  • Location: /shared_data/test_data/genomics-systems-comparison/lettuce/
      chr1/
        Lactuca__project1__2021-06-24__1152198variants__FLAPJACK.fjzip
        Lactuca__project1__2021-06-24__1152198variants__HAPMAP.zip
        markerlists.zip
      full/
        Lactuca__project1__2021-06-28__12983735variants__FLAPJACK.fjzip
        Lactuca__project1__2021-06-28__12983735variants__HAPMAP.zip

Potato (polyploid)
  • Format: VCF
  • Location: /shared_data/test_data/genomics-systems-comparison/potato/
      PRJNA414303.CHR5.filterNullGT.vcf.gz

source
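Several entries above quote dataset dimensions (markers x samples). The following is a minimal sketch for confirming those numbers on a VCF file before loading it into any of the systems. It uses only the Python standard library; the function names `open_text` and `vcf_dimensions` are hypothetical helpers introduced for illustration. It assumes a standard tab-delimited VCF in which the `#CHROM` header line has nine fixed columns followed by one column per sample.

```python
import gzip
import sys


def open_text(path):
    """Open a plain or gzip/bgzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def vcf_dimensions(path):
    """Return (n_variants, n_samples) for a VCF file."""
    n_variants = 0
    n_samples = 0
    with open_text(path) as handle:
        for line in handle:
            if line.startswith("##"):
                continue  # meta-information lines
            if line.startswith("#CHROM"):
                # Nine fixed columns (CHROM .. FORMAT) precede the samples.
                columns = line.rstrip("\n").split("\t")
                n_samples = max(len(columns) - 9, 0)
                continue
            if line.strip():
                n_variants += 1  # one data line per variant record
    return n_variants, n_samples


if __name__ == "__main__":
    variants, samples = vcf_dimensions(sys.argv[1])
    print(f"{variants} variants x {samples} samples")
```

Saved under a name of your choice, the script takes the VCF path as its only argument, for example the potato file listed above. It streams the file line by line, so memory use stays flat, but a full pass over the larger datasets will take a while.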