Definitions

...

Germplasm Metadata

Primary fields

  • *name: A required field describing the germplasm name. For example, the name most commonly used, or the default name. The germplasm table has a one-to-many relationship with the dnasample table: many dnasamples can be associated with a single germplasm name.
  • *external_code: A required code describing the unit of material from which the sample was generated. This is most likely a PlotID or a PlantID. The code should be meaningful in an adjacent germplasm or sample tracking database system. For example, this could be GID, PlotID, or SampleID. Each germplasm name can have several external_codes, but each external_code cannot be linked to more than one germplasm name.
  • species_name: An optional field for the species name. This is a CV term, and must first be added in Define | Controlled Vocabulary to maintain naming consistency. The species_name in the germplasm input file must match the entry in germplasm_species exactly (the match is case sensitive).
  • type_name: An optional field for the type, or generation, of the germplasm. For example, the germplasm could be an accession, inbred_line, f1_hybrid, f2, f3, f4, f5, etc. This is a CV term, and must first be added in Define | Controlled Vocabulary to maintain naming consistency. The type_name in the germplasm input file must match the entry in germplasm_type exactly (the match is case sensitive).
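
The external_code rule and the exact-match rule for CV terms described above can be checked before a file is loaded. The following is a minimal sketch and not part of GOBii: the function name `check_germplasm_rows`, the row dictionaries, and the CV sets passed in are all assumptions made for illustration.

```python
def check_germplasm_rows(rows, species_cv, type_cv):
    """Return a list of problems found in germplasm rows (hypothetical helper)."""
    problems = []
    code_to_name = {}
    for i, row in enumerate(rows):
        name, code = row.get("name", ""), row.get("external_code", "")
        if not name or not code:
            problems.append(f"row {i}: name and external_code are required")
            continue
        # An external_code may be linked to only one germplasm name.
        previous = code_to_name.setdefault(code, name)
        if previous != name:
            problems.append(
                f"row {i}: external_code {code!r} already linked to {previous!r}"
            )
        # CV terms must match exactly, including case.
        species = row.get("species_name")
        if species and species not in species_cv:
            problems.append(f"row {i}: species_name {species!r} not in CV")
        gtype = row.get("type_name")
        if gtype and gtype not in type_cv:
            problems.append(f"row {i}: type_name {gtype!r} not in CV")
    return problems
```

A germplasm name that appears with two different external codes passes this check, while one external code shared by two germplasm names is reported.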

Properties (all optional)

  • germplasm_heterotic_group: The germplasm groups within species. For example, NSS, SSS, A, or B for maize.
  • germplasm_id: A higher level of ID. For example, MGID.
  • germplasm_subsp: Subspecies grouping of germplasm. This can differ for each crop. For example: dent, flint, sweet, or pop for maize; indica or japonica for rice; bread wheat or durum wheat for wheat.
  • par1: The germplasm name for parent 1 of the germplasm (in a biparental cross this would be the female).
  • par2: The germplasm name for parent 2 of the germplasm (in a biparental cross this would be the male).
  • par3: The germplasm name for parent 3 of the germplasm.
  • par4: The germplasm name for parent 4 of the germplasm.
  • pedigree: The pedigree for the germplasm name.
  • seed_source_id: Seed source ID for the germplasm.

...

DNASample Metadata

Primary fields

  • *name: A required field describing the sample name. This is usually the name that is sent to the lab for processing. We recognize that a lab or LIMS system can track multiple levels of samples (batches, bags, sub-samples, etc.), but for our purposes we assume that the sample is in a plate or in tubes, ready for processing in the laboratory, so that the allele data can be connected back at the sample level within any project. Note: A unique sample within a GOBii instance is defined by the combination of project_name, dnasample_name, and dnasample_number, so the dnasample_name does not need to be unique within a project or across projects. The dnasample name in a project can also be the same as the germplasm name. This is often the case for legacy data that existed before sample tracking or LIMS systems were in place.
  • *uuid: A required field for a unique ID that can be used to identify a sample or sub-samples generated from a plot or plant. We advocate the use of a Universally Unique Identifier (UUID), a 128-bit number used to identify information in computer systems, e.g., 123e4567-e89b-12d3-a456-426655440000. Such a uuid helps to identify and track samples generated across multiple institutes and systems. Because long identifiers may not be manageable by vendors, the sample name can be sent to the vendor and the uuid maintained internally.
  • platename: An optional field describing the name of the plate that the sample is in. This can be a number (1, 2, 3, 4, etc.), or a name given by the lab.
  • *num: A required field describing the numerical order of the sample within a project. For example, 1-96 for a 96 well plate. The combination of dnasample name and num must be unique within the project. Even so, it is good practice to assign consecutive sample numbers to the samples in a project for ease of post-processing sorting.
  • well_col: An optional field describing the plate column coordinates for the sample. For example, 1-12 for a 96 well plate.
  • well_row: An optional field describing the plate row coordinates for the sample. For example, A-H for a 96 well plate.
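
As an illustration of the uuid and num fields above, the sketch below assigns a random UUID to each sample and reports any (name, num) combination that is repeated within one project. It uses only the Python standard library; the function names and the sample dictionaries are assumptions for this example, not part of GOBii.

```python
import uuid
from collections import Counter


def assign_uuids(samples):
    """Add a random (version 4) UUID string to each sample dict that lacks one."""
    for sample in samples:
        sample.setdefault("uuid", str(uuid.uuid4()))
    return samples


def duplicate_name_num(samples):
    """Return (name, num) pairs that occur more than once within one project."""
    counts = Counter((s["name"], s["num"]) for s in samples)
    return [pair for pair, n in counts.items() if n > 1]
```

Samples that share a name but have different num values are not reported, which matches the rule that only the combination needs to be unique.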

...

DNArun Metadata

Primary fields

  • *name: A required field describing the name of the sample that is returned from the vendor and is associated with the genotyping data. This can be the same as the name sent to the vendor, or it could have been concatenated with a vendor ID. The translation between the name sent to the vendor (dnasample_name) and the name returned from the vendor (dnarun_name) will be provided by the vendor, e.g., in a 'key' file.
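
The translation between dnarun_name and dnasample_name can be applied with a simple lookup once the vendor's key file is available. The sketch below assumes a tab-delimited key file with the column headers `dnasample_name` and `dnarun_name`. That layout is hypothetical, because vendor key files differ, and `load_key` is an illustrative helper rather than a GOBii function.

```python
import csv
import io


def load_key(key_text, sample_col="dnasample_name", run_col="dnarun_name"):
    """Parse a tab-delimited key file into {dnarun_name: dnasample_name}."""
    reader = csv.DictReader(io.StringIO(key_text), delimiter="\t")
    return {row[run_col]: row[sample_col] for row in reader}
```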


Properties (optional)

  • barcode: A barcode assigned to a sample. For example, for sequencing or other genotyping where samples are pooled.
Warning

"*" Mandatory fields


Loading Germplasm Metadata

  1. Select Wizards | DNA Sample Wizard. 
  2. Select PI, Project, and Experiment.
  3. Select the file to load. For additional information, refer to Selecting the germplasm file to load below.

Selecting the germplasm file to load

  1. For local files, click Browse to browse to your file, or drag and drop the file into the file list box. If you select the wrong file, check the checkbox next to the file, then click Remove Selected File(s).
  2. For large files, load them from the files folder in your crop server environment. Create a new folder under the files folder, then copy the folder name to the Remote Path field. 
  3. If you are loading many files with the exact same format, create a template at the end of the wizard process, then save it for future use. Use templates with caution; they do not key off matching field names, but do key off row and column positions. Ensure that when you use a template, your metadata fields and data are in the exact same columns as when your template was made.
  4. Select the file format from the File Format drop-down menu. .txt and .csv files are supported for mapset files.
  5. Click Preview Data. For local files, you must log into your crop server. For remote files already placed in the crop server file folder, you are already logged in and do not need to log in again.
  6. The top left-hand section of your file is now viewable in the preview section. Expand the columns to view the headers.
  7. Select the header position from the Header Position drop-down menu. For all marker, mapset and sample files, the headers must be in the TOP position. 
  8. Select the Field header coordinate by clicking the row of the column headers in the file preview. The row number is returned in the Field header coordinate field (0 is the top row). 



  9. Click Next to map data to the database fields.
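
To illustrate the 0-based Field header coordinate described in step 8, the sketch below splits a delimited text file into a header row and data rows. It is an assumption-laden illustration of the idea, not the loader's actual implementation, and `read_with_header_row` is a hypothetical name.

```python
import csv
import io


def read_with_header_row(text, header_row=0, delimiter="\t"):
    """Split a delimited file into (headers, data rows) using a 0-based header row."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    headers = rows[header_row]
    data = rows[header_row + 1:]
    return headers, data
```

With `header_row=0` the first line of the file is treated as the headers. A file with one leading comment line would need `header_row=1`.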

...

  1. Select Wizards | DNA Sample Wizard.
  2. Select PI, Project, and Experiment.
  3. Select the file to load. For additional information, refer to Selecting the DNA sample file to load.

Selecting the DNA sample file to load

  1. For local files, click Browse to browse to your file, or drag and drop the file into the file list box. If you select the wrong file, check the checkbox next to the file, then click Remove Selected File(s).
  2. For large files, load them from the files folder in your crop server environment. Create a new folder under the files folder, then copy the folder name to the Remote Path field. 
  3. If you are loading many files with the exact same format, create a template at the end of the wizard process, then save it for future use. Use templates with caution; they do not key off matching field names, but do key off row and column positions. Ensure that when you use a template, your metadata fields and data are in the exact same columns as when your template was made.
  4. Select the file format from the File Format drop-down menu. .txt and .csv files are supported for mapset files.
  5. Click Preview Data. For local files, you must log into your crop server. For remote files already placed in the crop server file folder, you are already logged in and do not need to log in again.
  6. The top left-hand section of your file is now viewable in the preview section. Expand the columns to view the headers.
  7. Select the header position from the Header Position drop-down menu. For all mapset files, the headers must be in the TOP position. 
  8. Select the Field header coordinate by clicking the row of the column headers in the file preview. The row number is returned in the Field header coordinate field. The row count starts at 0.
  9. Click Next to map data to the database fields, as shown in the screenshot above.
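
Because templates key off column positions rather than field names (step 3), it can help to compare a new file's header row with the header row the template was built from before reusing the template. The sketch below is one hypothetical way to do that outside the wizard. `template_mismatches` is an illustrative name, and the wizard itself is not known to offer such a check.

```python
def template_mismatches(template_headers, file_headers):
    """List column positions where a new file's headers differ from the template's."""
    mismatches = []
    width = max(len(template_headers), len(file_headers))
    for col in range(width):
        expected = template_headers[col] if col < len(template_headers) else None
        found = file_headers[col] if col < len(file_headers) else None
        if expected != found:
            mismatches.append((col, expected, found))
    return mismatches
```

An empty result means every column is in the same position as in the template. Any entry gives the 0-based column index, the header the template expects, and the header actually found.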

Mapping DNA Sample Metadata

  1. Click Next without mapping any terms related to Germplasm, and go directly to the DNAsample Information table.
  2. The mandatory fields for both DNASample Information and DNArun/DS_DNArun Information are highlighted in bluish green (cyan). The left box shows the terms from your data file.
  3. Drag and drop your dnasample_name, germplasm_external_code, dnasample_UUID, and dnasample_number from Data file fields to name, external_code, uuid, and number, respectively, in the DNAsample Information table.
  4. If dnasample_platename, dnasample_well_col, and dnasample_well_row are available in your data file, drag and drop them in the same way, as shown in the screenshot below:

  5. Drag and drop your DNArun_name, dnasample_name, and dnasample_number from Data file fields to name, dnasample_name, and number, respectively, in the DNArun/DS_DNArun Information table.
  6. If dnasample_platename, dnasample_well_col, and dnasample_well_row are available in your data file, drag and drop them in the same way, as shown in the screenshot above.
  7. Use the Property tables below the DNAsample Information table and the DNArun/DS_DNArun Information table to map the terms related to DNA sample and DNArun properties.
  8. Drag and drop your additional germplasm fields from Data file fields into the Property field and click Next.
  9. Skip mapping terms to DNA Sample Information table and just click Next.
  10. Click Finish to load the germplasm file to the database. You may need to log into the server to submit the file.