...
Dataset | Format | Location |
---|---|---|
Maize NAM | CSV | /shared_data/test_data/NAM_HM32/csv |
Simulated datasets | | |
polyploid data | VCF | Moira to share a dataset - invite to next meeting |
indel data | | |
Rice High Density Array | VCF | 700K SNPs x ~1500 samples; SNPs only, also available as VCF. Francisco has already loaded it into Gigwa (own instance) with no problems. http://rs-bt-mccouch4.biotech.cornell.edu/staged_data/CSHL_EVA_Release_HDRA.tar |
African rice | VCF | https://gigwa.ird.fr/gigwa/?module=AfricanRice (metadata availability?) |
3,000 rice genomes | | 29M SNPs - too large? |
lettuce (Wageningen public dataset) | VCF | 12M markers x 500 accessions; 3 VCFs (one SNPs, one indels, one structural variants); 40 GB. https://www.nature.com/articles/s41588-021-00831-0 /pub/CNSA/data2/CNP0000335/Other/variation |
...
Start with an overview of features so we can better understand the benchmarking.
Action items April 21st / 29th
All - check that you can access the site and load the database. Gigwa still has to be loaded onto the VM. Guilhem can access the site but needs a user name.
...