This meeting provided a ton of useful feedback on GOBii and the GDM platform. This document focuses on the suggestions for improvement provided at that meeting.
CIMMYT
...
meetings on 7/28/21 and 8/4/21.
Category | Feedback | Reporter | Notes | GSD |
---|---|---|---|---|
General | Can’t complete genotyping analysis workflows smoothly | CIMMYT | ||
“run and maintain” mode at CIMMYT pending GOBii / EBS integration | CIMMYT | |||
Want automatic project creation from BMS → GOBii | ICRISAT | |||
Direct upload of standard formats and automatic metadata harvesting | ICRISAT | |||
intersection data file extractor (samples + markers) | ICRISAT | |||
Loader | Loader validations stricter than DB requirements: can’t load 2 dnaruns for the same sample | CIMMYT | CIMMYT uses same experiment / project for all datasets | |
Diagnosing errors - over-reliance on help desk | CIMMYT | |||
Lacking useful transformations during loading: update marker names that are based on sequence positions | CIMMYT | |||
Steep learning curve | CIMMYT | |||
No tool for SNP recalling (for KASP markers – but unclear where/how this could happen) | CIMMYT | |||
Manual mapping is tedious: different fields from different service providers | IRRI | Alleviated with web-loader, templates | ||
Errors with certain characters when importing sample files generated by B4R | IRRI | |||
Indels not supported unless encoded as +/- | IRRI | |||
JIRA issue GSD-154: https://gobiiproject.atlassian.net/browse/GSD-154
Data often requires cleaning prior to upload | IRRI | |||
Requirement to associate data with PI | IRRI | |||
Extractor | Inflexible query system: can’t select multiple dataset types for download; can’t extract by multiple factors, e.g. intersect of markers and samples | CIMMYT | ||
File delivery system convoluted | CIMMYT | |||
“QC” stats from KDC not provided to users during download | CIMMYT | |||
CAST | Cannot combine data from same samples in different datasets into one row | CIMMYT | ||
Does not facilitate the selection of marker groups to use in analyses | CIMMYT | |||
Timescope | Requires deployment of another tool instead of combining all CRUD functionalities in one tool | CIMMYT | ||
Separate authentication system, not linked to institutional authentication system | CIMMYT | |||
Data | Some data dependencies have led to unplanned processes: sample linkage to project and UUID implementation have caused CIMMYT to use one project for all datasets | CIMMYT | ||
Variants have not been implemented to facilitate analyses for the “same” marker used in different platforms, potentially with different names, over time | CIMMYT | |||
Marker groups are based on markers instead of variants; not linked to traits, phenotypes, etc. | CIMMYT | |||
Current data structure seems to prevent the storage of data linked to each genotypic call or data point: QC values can’t be associated with data points; VCF data apart from GT isn’t preserved | CIMMYT | |||
Allele frequency data cannot be stored or retrieved easily | CIMMYT | |||
Across ST and GOBii, there is no clear model for how to store “consensus” calls or a “reference” genotype or fingerprint constructed from different samples over time | CIMMYT | |||
In an integrated system, many fields of information may be duplicated and sometimes have different “IDs”, e.g. an ID for germplasm in CB and a new ID for germplasm in GOBii | CIMMYT | |||
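
The "intersect of markers and samples" extraction requested above (ICRISAT, CIMMYT) can be illustrated with a minimal sketch. This is not GOBii code: the in-memory matrix layout, the function name `extract_intersection`, and the `missing` placeholder are all assumptions made for illustration only.

```python
from typing import Dict, Iterable, List

# Hypothetical in-memory genotype matrix: marker name -> {sample name -> call}.
# This layout is an assumption for illustration, not the GOBii storage model.
GenotypeMatrix = Dict[str, Dict[str, str]]


def extract_intersection(
    matrix: GenotypeMatrix,
    markers: Iterable[str],
    samples: Iterable[str],
    missing: str = "N",
) -> List[List[str]]:
    """Return a table restricted to the requested markers AND samples.

    The first row is a header; each following row is one marker.
    Markers absent from the matrix are skipped; a requested sample with
    no call for a marker gets the `missing` placeholder.
    """
    sample_list = list(samples)
    rows = [["marker"] + sample_list]
    for marker in markers:
        calls = matrix.get(marker)
        if calls is None:
            continue  # requested marker is not in this dataset
        rows.append([marker] + [calls.get(s, missing) for s in sample_list])
    return rows


# Toy data, invented for the example.
matrix = {
    "m1": {"s1": "A/A", "s2": "A/T", "s3": "T/T"},
    "m2": {"s1": "C/C", "s2": "C/C", "s3": "C/G"},
    "m3": {"s1": "+/-", "s2": "-/-", "s3": "+/+"},
}
result = extract_intersection(matrix, ["m1", "m3", "m9"], ["s1", "s3"])
```

Here a single request filters on both factors at once, instead of extracting by marker list and by sample list separately and intersecting the files by hand.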
...