Supported Dataset Types

Two Letter Nucleotide

SNP values allowed: AA, CC, GG, TT, AC, AG, AT, CG, CT, GT, CA, GA, TA, GC, TC, TG

Indel values allowed: ++, --, -+, +- 

Allele order is stored exactly as entered and is preserved on extract.

Missing data values allowed: NN, ?, or 0. Both ? and 0 are converted to NN upon loading.

A dash "-", sometimes used for missing data, must be manually replaced with NN because it conflicts with the indel values.
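As an illustration of the rules above, the following is a minimal sketch of a pre-load check for two letter nucleotide calls. The function name normalize_two_letter is hypothetical and not part of GOBii; it only mirrors the allowed values and the ?/0 to NN conversion described in this section.

```python
# Allowed values taken from the lists in this section.
SNP_VALUES = {a + b for a in "ACGT" for b in "ACGT"}  # AA, AC, ..., TT (16 values)
INDEL_VALUES = {"++", "--", "-+", "+-"}
MISSING_INPUT = {"NN", "?", "0"}  # "?" and "0" become NN on load


def normalize_two_letter(call: str) -> str:
    """Hypothetical helper: return the call as it would look after loading."""
    call = call.strip()
    if call == "-":
        # A lone dash is ambiguous with the indel values, so refuse to guess.
        raise ValueError('replace "-" with NN before loading')
    if call in MISSING_INPUT:
        return "NN"
    if call in SNP_VALUES or call in INDEL_VALUES:
        return call  # allele order is kept as entered
    raise ValueError(f"unsupported two letter call: {call!r}")
```

For example, normalize_two_letter("?") returns "NN", while normalize_two_letter("CA") returns "CA" unchanged.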



If needed, additional missing values can be added to the missingIndicators.txt file found in <gobii root directory> gobii_bundle/loaders/etc

The file currently contains:

NTC
Uncallable
Unknown
Unreadable



Note: The missing values listed in the missingIndicators.txt file apply only to the 2 letter nucleotide dataset type and will not be converted for other dataset types.
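To preview the effect of these indicators on a column of calls before loading, a sketch along the following lines may help. Both function names are hypothetical, and the parser assumes one indicator per line, which may not match the real layout of the file.

```python
def parse_indicators(text: str) -> set[str]:
    """Assumption: one missing indicator per line; blank lines are ignored."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def replace_missing(calls: list[str], indicators: set[str]) -> list[str]:
    """Replace any call that matches an indicator with NN; leave the rest as-is."""
    return ["NN" if call in indicators else call for call in calls]


# The four indicators listed in this section.
indicators = parse_indicators("NTC\nUncallable\nUnknown\nUnreadable\n")
cleaned = replace_missing(["AA", "NTC", "Unknown", "CT"], indicators)
```

This only applies to two letter nucleotide data, in line with the note above.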



VCF files are converted to 2_letter_nucleotide upon loading, so VCF datasets should be defined with that dataset type.

IUPAC

IUPAC calls are converted to the 2 letter nucleotide format upon loading.

The conversion is described on the page "IUPAC to Biallelic Transposition".

IUPAC allowed nucleotide characters are described here:  http://www.bioinformatics.org/sms/iupac.html

IUPAC allowed nucleotide characters as defined in TASSEL: https://bitbucket.org/tasseladmin/tassel-5-source/wiki/UserManual/Appendix/NucleotideCodes
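The general idea of the conversion can be sketched with the standard IUPAC single-base and two-base ambiguity codes. This is not the GOBii transposition table: the allele order chosen for heterozygous codes and the handling of any other characters are assumptions, and the function name is hypothetical. The transposition page referenced above remains the authoritative mapping.

```python
# Standard IUPAC codes that describe one or two bases.
# Allele order inside each heterozygous pair is an assumption.
IUPAC_TO_TWO_LETTER = {
    "A": "AA", "C": "CC", "G": "GG", "T": "TT",
    "R": "AG",  # A or G
    "Y": "CT",  # C or T
    "S": "CG",  # C or G
    "W": "AT",  # A or T
    "K": "GT",  # G or T
    "M": "AC",  # A or C
    "N": "NN",  # any base, treated here as missing
}


def iupac_to_two_letter(code: str) -> str:
    """Hypothetical helper: map one IUPAC character to a two letter call."""
    try:
        return IUPAC_TO_TWO_LETTER[code.strip().upper()]
    except KeyError:
        # Three-base codes (B, D, H, V) have no biallelic equivalent.
        raise ValueError(f"no two letter equivalent for {code!r}") from None
```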

Codominant

Values allowed: 0,1,2

  • 1 is the heterozygote.

  • 0 can be used for the absence of the trait positive value.

  • 2 can be used for presence of the trait positive value.

Missing data value = N

Dominant

Values allowed: 0,1

  • 0 can be used for absence of the allele.

  • 1 can be used for presence of the allele (in the homozygous or heterozygous form).

Missing data value = N
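The codominant and dominant rules above lend themselves to a simple pre-load check. The sketch below is illustrative only; the dictionary keys and the function name invalid_calls are hypothetical labels, not GOBii identifiers.

```python
# Allowed values per dataset type, as listed in this section.
ALLOWED = {
    "codominant": {"0", "1", "2"},
    "dominant": {"0", "1"},
}
MISSING = "N"  # missing data value for both types


def invalid_calls(calls: list[str], dataset_type: str) -> list[tuple[int, str]]:
    """Return (position, value) for every call that is not allowed."""
    allowed = ALLOWED[dataset_type] | {MISSING}
    return [(i, call) for i, call in enumerate(calls) if call not in allowed]
```

For example, invalid_calls(["0", "2", "N", "1"], "dominant") reports the 2 at position 1, while the same list passes as codominant.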

SSR Allele Size

SSR alleles must be converted to eight-digit values before loading to GOBii.

For example, allele sizes of 123/125 must be converted to 01230125.

Missing data value = 00000000

Use caution when converting from Microsoft Excel: it treats these values as numbers and strips leading zeros, so 00000000 becomes 0 and a value such as 01230125 becomes 1230125.
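The eight-digit encoding and the leading-zero problem can be handled in a small script instead of a spreadsheet. This sketch infers from the 123/125 example that each allele is zero-padded to four digits. The function names, the "/" separator default, and the treatment of an empty cell as missing are assumptions.

```python
SSR_MISSING = "00000000"


def encode_ssr(allele_pair: str, sep: str = "/") -> str:
    """Hypothetical helper: encode 'size1/size2' as eight digits, four per allele."""
    if not allele_pair.strip():
        return SSR_MISSING  # assumption: an empty cell means missing
    sizes = [int(size) for size in allele_pair.split(sep)]
    if len(sizes) != 2 or not all(0 <= size <= 9999 for size in sizes):
        raise ValueError(f"cannot encode SSR alleles: {allele_pair!r}")
    return "".join(f"{size:04d}" for size in sizes)


def repair_ssr(value: str) -> str:
    """Restore leading zeros that a spreadsheet stripped from an encoded value."""
    return value.strip().zfill(8)
```

Here encode_ssr("123/125") returns "01230125", and repair_ssr("0") returns "00000000". Keeping the values as text throughout avoids the stripping in the first place.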