Dataset Wizard

Definitions

Use the Dataset Wizard to load your allele data matrix. The minimum fields you need are the dnarun names and the marker names, but you can also load marker and sample metadata if your matrix file contains them.

Using the Dataset Wizard

  1. Select Wizards | Dataset Wizard.
  2. Use the fields in the left side menu to navigate to the dataset you want to load data into: select the PI, then the Project, then the Experiment, and then the Dataset. The options available in each field depend on the selections made in the previous fields. For example, the Dataset drop-down menu lists only the datasets that belong to the selected experiment. Additional information about creating datasets is available on the Datasets page.

Selecting the file to load

  1. For local files, click Browse to locate your file, or drag and drop the file into the file list box. If you select the wrong file, select the checkbox next to it, then click Remove Selected File(s).
  2. For large files, load them from the files folder in your crop server environment. Create a new folder under the files folder, then copy the folder name to the Remote Path field. 
  3. If you are loading many files with exactly the same format, create a template at the end of the wizard process and save it for future use. Use templates with caution: they match on row and column positions, not on field names. When you use a template, ensure that your metadata fields and data are in exactly the same columns as in the file from which the template was made.
  4. Select the file format from the File Format drop-down menu. The .txt and .csv formats are supported for mapset, sample, marker, and dataset files. The .vcf and .hmp formats are also supported for dataset files. NOTE: .vcf files are converted to 2_letter_nucleotide format, so define them as such in the dataset_type field. A single read of an allele is sufficient to call that allele as present (i.e., the number or ratio of reads and the quality of reads do not affect the allele call).
  5. Click Preview Data.
  6. The top-left section of your file is now visible in the preview section. Expand the columns to view the headers.
  7. Select the marker and sample positions (either TOP or LEFT).
  8. Select the first data coordinate, i.e., the position of the first genotype call in the matrix. Sample and marker metadata are then assumed to be above or to the left of this data point.
  9. Click Next to map data to the database fields.
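
The role of the first data coordinate in step 8 can be illustrated with a small sketch. This is not the wizard's own code. The function name split_matrix, the zero-based coordinates, the tab-delimited layout, and the sample values are all assumptions made for illustration. It shows how one coordinate divides a matrix into the block above it, the block to its left, and the genotype calls.

```python
import csv
import io


def split_matrix(rows, first_data_row, first_data_col):
    """Split a parsed matrix at the first data coordinate.

    Coordinates are zero-based here (an assumption of this sketch).
    Returns (top_block, left_block, calls).
    """
    # Rows above the first genotype call, restricted to the data columns.
    top_block = [row[first_data_col:] for row in rows[:first_data_row]]
    # Columns to the left of the first genotype call, for the data rows.
    left_block = [row[:first_data_col] for row in rows[first_data_row:]]
    # The genotype calls themselves.
    calls = [row[first_data_col:] for row in rows[first_data_row:]]
    return top_block, left_block, calls


# Hypothetical file: markers on the LEFT, samples (dnaruns) on the TOP.
sample_text = (
    "marker_name\tchromosome\tdnarun_A\tdnarun_B\n"
    "m1\t1\tAA\tAG\n"
    "m2\t1\tGG\tNN\n"
)
rows = list(csv.reader(io.StringIO(sample_text), delimiter="\t"))

# The first genotype call ("AA") sits at row 1, column 2.
top, left, calls = split_matrix(rows, first_data_row=1, first_data_col=2)
print(top)    # sample names above the data
print(left)   # marker name and marker metadata to the left of the data
print(calls)  # genotype calls
```

In this layout the top block holds the dnarun names and the left block holds the marker names plus one marker metadata column. If the markers were on TOP and the samples on LEFT, the two blocks would swap roles.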

Mapping Dataset Wizard Metadata

  1. For a dataset, two key fields must be mapped: the marker name to Dataset_marker name, and the dnarun_name to the dnarun_name field.
  2. Metadata for the markers and samples can be loaded in the same way as in the Marker Wizard and Sample Wizard.
  3. Click Finish to load the file to the database.
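
Because templates match on row and column positions and not on field names, it can help to confirm that a new file has the same labels in the same positions before you reuse a template saved at the end of the wizard. The sketch below is one possible pre-check run outside the wizard. The function template_mismatches, the zero-based (row, column) keys, and the sample labels are hypothetical and are not part of the wizard.

```python
def template_mismatches(template_labels, new_rows):
    """Compare expected labels at fixed positions against a new file.

    template_labels maps zero-based (row, col) positions to the label
    that was at that position when the template was made.
    Returns a list of (row, col, expected, found) for each difference.
    """
    mismatches = []
    for (row, col), expected in template_labels.items():
        try:
            found = new_rows[row][col]
        except IndexError:
            # The new file has no cell at this position.
            found = None
        if found != expected:
            mismatches.append((row, col, expected, found))
    return mismatches


# Labels recorded from the file the template was built from (hypothetical).
template_labels = {(0, 0): "marker_name", (0, 1): "chromosome"}

# A new file in which an extra column shifts "chromosome" to the right.
new_rows = [
    ["marker_name", "position", "chromosome", "dnarun_A"],
    ["m1", "100", "1", "AA"],
]

for row, col, expected, found in template_mismatches(template_labels, new_rows):
    print(f"row {row}, column {col}: expected {expected!r}, found {found!r}")
```

An empty result means the checked positions agree with the template. Any reported mismatch suggests that the template would read the wrong column for that field.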