Datasets

Definitions

In a Dataset, you define the suite of analyses that were applied to the data to generate the dataset genotyping matrix.

Field Descriptions

  • Dataset Name: Mandatory field for the name of the dataset. The dataset name must be unique within an experiment.  
  • Dataset Type: Mandatory field describing the type of data within the dataset, that is, the text format of the genotypic data calls, such as IUPAC or 2-letter nucleotide. For additional information, see Supported Dataset Types. You can also view the dataset types by clicking Define | Controlled Vocabulary | dataset_type. The dataset_type values are system-generated and used by the software; users should not modify them. For additional information, refer to Controlled Vocabulary.
  • Calling Analysis: Mandatory field describing the calling analysis applied to the genotyping data to produce the dataset. Define the calling analysis in Define | Analyses. 
    • Calling Analysis is a required field; if no calling analysis is applied to the genotyping data, then define a generic calling analysis. For example, "none" or "no calling."
  • Data File: A system-generated file path for the HDF5 file that is created after the dataset is uploaded. The curator does not enter anything in this field.
  • Data Table: A system-generated name for the uploaded dataset. The curator does not enter anything in this field.
  • Analyses:  Optionally add additional analyses that were applied to the data to generate the dataset. Define these analyses in Define | Analyses.
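
The field rules above can be summarized as a small data model. The following Python sketch is illustrative only: the class DatasetRecord, its attribute names, and the validate function are hypothetical and are not part of the software. It assumes only what the descriptions state: three fields are mandatory, Analyses is optional, and Data File and Data Table are filled in by the system.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical model of the fields on the Datasets page."""
    dataset_name: str                 # mandatory; unique within an experiment
    dataset_type: str                 # mandatory; e.g. "IUPAC" (example value)
    calling_analysis: str             # mandatory; e.g. "none" if no calling was applied
    analyses: List[str] = field(default_factory=list)  # optional additional analyses
    data_file: Optional[str] = None   # system-generated after upload; not set by the curator
    data_table: Optional[str] = None  # system-generated after upload; not set by the curator


def validate(record: DatasetRecord) -> List[str]:
    """Return a list of problems with the curator-supplied mandatory fields."""
    problems = []
    for name in ("dataset_name", "dataset_type", "calling_analysis"):
        if not getattr(record, name).strip():
            problems.append(f"{name} is mandatory")
    return problems
```

A record with all three mandatory fields filled in yields an empty problem list; leaving any of them blank yields one message per missing field.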

Add a New Dataset to an Experiment

  1. From the menu bar, select Create | Datasets. A list of existing datasets displays on the left-hand side of the page. Click Export to export the listed datasets as a text file.
  2. Using the drop-down menu on the left-hand side of the page, select the experiment to which you want to add your dataset. Any datasets previously created for the selected experiment are displayed. Alternatively, if you previously selected an experiment in Create | Experiments, the dataset list is already filtered to that experiment. For additional information, refer to Experiments.
  3. In the Dataset Name field, enter a dataset name that is unique within the experiment.
  4. Select a dataset type from the Dataset Type drop-down menu. For additional information see Supported Dataset Types.
  5. Select a calling analysis from the Calling Analysis drop-down menu. Define the calling analysis in Define | Analyses.
  6. Add optional additional analyses by checking the pre-defined Analyses checkboxes.
  7. Click Add New. The new dataset name displays on the left-hand side of the page.
  8. Alternatively, you can create a new dataset from an existing one: select the existing dataset, edit its fields, assign a new dataset name, then click Add New. Or click Clear Fields while an existing dataset is selected, fill in the applicable fields, then click Add New. Clicking Clear Fields does not delete the existing dataset entry; when you fill in the emptied fields and click Add New, a new dataset is created.
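
The naming rule that governs Add New (a dataset name must be unique within its experiment, but not across experiments) can be sketched as follows. This is a minimal illustration under stated assumptions, not the software's implementation: the add_new function, the DuplicateDatasetName exception, and the dictionary used as a registry are all hypothetical.

```python
from typing import Dict, Set


class DuplicateDatasetName(ValueError):
    """Raised when a dataset name is already used in the same experiment."""


def add_new(registry: Dict[str, Set[str]], experiment: str, dataset_name: str) -> None:
    """Register a dataset name under an experiment, enforcing per-experiment uniqueness."""
    names = registry.setdefault(experiment, set())
    if dataset_name in names:
        raise DuplicateDatasetName(
            f"{dataset_name!r} already exists in experiment {experiment!r}"
        )
    # Adding a name never modifies or removes existing entries.
    names.add(dataset_name)


# Hypothetical usage: the same name is accepted in two different experiments.
registry: Dict[str, Set[str]] = {}
add_new(registry, "experiment_A", "dataset_1")
add_new(registry, "experiment_B", "dataset_1")
```

Calling add_new a second time with "dataset_1" for "experiment_A" would raise DuplicateDatasetName, which mirrors the requirement to assign a new dataset name when creating a dataset from an existing entry.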


Update Metadata for a Dataset

  1. From the menu bar, select Create | Datasets. A list of existing datasets displays on the left-hand side of the page.
  2. Filter the list of datasets by experiment using the Select Experiment drop-down menu.  
  3. Select the dataset you want to update. Details for the dataset display on the right-hand side of the page. Edit any applicable fields. You can also change the Dataset Name; any entities associated with the dataset remain associated after the rename.
  4. Click Update. The requested changes are made.

Upload a Dataset with the Dataset Wizard

After creating an analysis dataset, you can load its data using the Dataset Wizard button on the bottom-right menu**. The data does not have to be loaded at this time; you can load it later using the Dataset Wizard, accessed directly from the menu bar. For additional information, refer to Dataset Wizard.

** Please note that in many cases, it may be necessary or useful to first upload data about germplasm, samples, and/or markers, before proceeding to load a Dataset. Please see the Marker Wizard and DNA Sample Wizard for more information.