2018 User Feedback and GOBii Responses to Comments
Created by Star Yanxin Gao, last modified by Liz on Nov 01, 2018
Similar to 2017, a comprehensive survey was sent to CG users in August 2018 to gauge their feedback on and assessment of the GOBii project from 08/2017 to 08/2018. The 33 users surveyed represented PIs, steering teams, curators, MID breeders, developers, system administrators, and IT managers at CIMMYT, ICRISAT, and IRRI. The survey covered the eight categories below, averaging six questions per category:
Core GDM/GOBii functionalities
Deployment and System Administration Support
Requirement gathering process
Data loading
Data extraction
Communication and engagement
GOBii-funded tool development
Overall User Satisfaction
A simple 1-to-5 scale was used, with 1 = lowest and 5 = highest satisfaction.
Summary
We received a 60% response rate (20 of 33), with 7 responses from CIMMYT, 10 from ICRISAT, and 3 from IRRI. Of the respondents, 45% represented application teams or breeders; 35% developers, IT managers, or system administrators; and 30% PIs or steering teams.
Similar to 2017, users' overall satisfaction with GOBii is very high for overall performance, release deployment, and engagement with the development, application, and PI/steering teams.
More specifically, GOBii communication and team engagement received the highest ratings across centers, while the weakest area was core system functionality. In particular, some users were concerned that GOBii lacks data delete and update functionality, and that there are no systematic software checks to validate loaded and extracted data.
Table 1. 2018 CG Users' feedback and acceptance and satisfaction ratings: 1= lowest and 5 = highest (September 2018)
Feedback Category/Topics | CIMMYT | ICRISAT | IRRI | Average |
Core GDM/GOBii functionalities | 2.45 | 3.56 | 3.38 | 3.14 |
2.1 Data loading | 2.57 | 3.67 | 3.67 | 3.26 |
2.2 Data extracting | 2.57 | 3.88 | 4.00 | 3.39 |
2.3 Data updates | 1.67 | 3.56 | 3.00 | 2.83 |
2.4 Data deletion | 1.33 | 2.75 | 3.00 | 2.29 |
2.5 Data QC post loading and extraction | 2.50 | 3.44 | 3.33 | 3.11 |
2.6 User authentication | 3.67 | 4.33 | 3.33 | 3.94 |
2.7 Data access control | 2.83 | 3.22 | 3.33 | 3.11 |
Deployment and Sys Admin Support | 3.31 | 3.81 | 3.93 | 3.66 |
3.1 Release and deployment process | 3.00 | 4.00 | 3.50 | 3.53 |
3.2 Pre-release QC of new features | 2.83 | 3.71 | 4.00 | 3.40 |
3.3 Sys admin support for deployment | 3.75 | 4.00 | 4.00 | 3.92 |
3.4 Sys admin support for maintenance | 4.00 | 3.71 | 4.00 | 3.85 |
3.5 Sys admin engagement | 4.00 | 4.14 | 4.50 | 4.15 |
3.6 System documentation | 3.00 | 3.50 | 4.00 | 3.42 |
3.7 Ease of system maintenance | 3.00 | 3.50 | 3.50 | 3.33 |
Requirement process | 3.21 | 3.71 | 3.80 | 3.57 |
4.1 Requirements gathering and clarification | 3.14 | 3.78 | 3.67 | 3.53 |
4.2 Requirement prioritization | 2.71 | 4.00 | 3.67 | 3.47 |
4.3 Requirement (GR) specification (clarity) | 3.50 | 3.89 | 4.00 | 3.81 |
4.4 Tracking and signing off | 3.60 | 3.33 | 4.00 | 3.53 |
4.5 Ease of submitting requirements (1=difficult, 5=easy) | 3.40 | 3.56 | 3.67 | 3.53 |
Data loading | 3.42 | 3.54 | 4.10 | 3.59 |
5.1 Handling different types of data | 3.50 | 3.43 | 4.50 | 3.62 |
5.2 Issue reporting | 3.50 | 3.43 | 4.50 | 3.62 |
5.3 Data mapping validation | 3.67 | 3.43 | 4.00 | 3.58 |
5.4 Loading large datasets | 3.25 | 3.29 | 4.00 | 3.38 |
5.5 Loading error logs and email notification | 3.25 | 4.14 | 3.50 | 3.77 |
Data extraction | 3.13 | 3.62 | 3.78 | 3.50 |
6.1 Extract features | 3.20 | 3.44 | 4.00 | 3.47 |
6.2 Data extract process | 3.40 | 3.33 | 4.00 | 3.47 |
6.3 Job status and extract output | 3.20 | 3.75 | 3.33 | 3.50 |
6.4 Data integrity | 2.60 | 3.56 | 3.67 | 3.29 |
6.5 Extraction issue reporting | 3.20 | 3.89 | 4.33 | 3.76 |
6.6 Extraction error logs and email notification | 3.20 | 3.78 | 3.33 | 3.53 |
Communication effectiveness | 4.17 | 4.39 | 4.00 | 4.25 |
7.1. Online meetings | 4.20 | 4.22 | 4.33 | 4.24 |
7.2 Effectiveness to engage and communicate | 4.17 | 4.56 | 3.00 | 4.17 |
7.3 Face to face visits | 4.50 | 4.11 | 4.00 | 4.22 |
7.4 Workshops and training | 3.67 | 4.67 | 4.33 | 4.28 |
7.5 Webinars | 4.33 | 4.38 | 4.33 | 4.35 |
GOBii-funded tool development | 3.59 | 3.74 | 4.13 | 3.77 |
8.1 QC-KDC functionalities | 3.40 | 3.56 | 3.67 | 3.53 |
8.2 GOBii-QC Integration | 2.80 | 3.63 | 3.67 | 3.38 |
8.3 F1 verification functionalities | 4.00 | 3.89 | 4.00 | 3.93 |
8.4 Line verification functionalities | 4.00 | 3.89 | 4.33 | 4.00 |
8.5 MABC functionalities | 4.25 | 3.78 | 4.33 | 4.00 |
8.6 GOBii-Flapjack Integration | 3.00 | 3.78 | 4.33 | 3.61 |
8.7 GS-Galaxy functionalities | 4.00 | 3.75 | 4.33 | 3.93 |
8.8 GOBii-GS-Galaxy Integration | 4.00 | 3.63 | 4.33 | 3.87 |
Overall User Satisfaction | 3.63 | 4.17 | 4.33 | 4.05 |
9.1 Scope and clarity of road map | 3.00 | 3.89 | 4.33 | 3.71 |
9.2 App teams management | 4.25 | 3.89 | 4.33 | 4.06 |
9.3 Development team management | 4.00 | 4.22 | 4.67 | 4.24 |
9.4 Engagement with steering teams, PIs, and SABs | 3.50 | 4.22 | 4.33 | 4.06 |
9.5 Deployment and release | 3.50 | 4.44 | 4.00 | 4.13 |
9.6 Overall satisfaction | 3.60 | 4.33 | 4.33 | 4.12 |
Grand Total | 3.31 | 3.81 | 3.92 | 3.67 |
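Table 1's "Average" column is not a simple mean of the three center scores. It appears broadly consistent with a mean weighted by the number of respondents per center (7 for CIMMYT, 10 for ICRISAT, 3 for IRRI, as reported in the summary), although a few rows deviate slightly, possibly because not every respondent answered every question. The sketch below illustrates that assumption; the weighting scheme is inferred, not stated in the report.

```python
# Sketch: reproducing Table 1's "Average" column as a respondent-weighted mean.
# Assumption (not stated in the report): averages are weighted by the number
# of responses per center -- 7 (CIMMYT), 10 (ICRISAT), 3 (IRRI).

WEIGHTS = {"CIMMYT": 7, "ICRISAT": 10, "IRRI": 3}

def weighted_average(scores: dict) -> float:
    """Respondent-weighted mean of per-center scores."""
    total = sum(WEIGHTS[center] * score for center, score in scores.items())
    return total / sum(WEIGHTS[center] for center in scores)

# Row "Core GDM/GOBii functionalities": published average is 3.14
core = weighted_average({"CIMMYT": 2.45, "ICRISAT": 3.56, "IRRI": 3.38})

# Row "2.4 Data deletion": published average is 2.29
deletion = weighted_average({"CIMMYT": 1.33, "ICRISAT": 2.75, "IRRI": 3.00})
```

Under this assumption the two rows above reproduce the published averages to within rounding.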
Figure 1 shows the specific questions asked within each survey category, with the dotted yellow line showing the average rating and the yellow highlight showing the 95% confidence interval of the average. Responses are split by the institute where the survey was carried out.
Table 2. Comments and GOBii responses to Users' comments
Category | Respondent ID | CG User Comments | GOBii Team Response | Action items prioritized and due date | Responsible |
| 5 | Data Integrity This year we encountered a few issues related to the display of extracted data, and now even while loading datasets with more than 10K markers in version 1.4. Essential features for curators, such as delete/modify/hide datasets/information, are not ready yet. | Data integrity is our highest priority from now on. Our system is fast but complex, and we now need to focus on data integrity and fully pressure-test the system with high-volume datasets. We are bringing in a consultant to review and overhaul our QA/QC processes and take these issues very seriously. | | Liz/Josh/Deb |
6 | Need CRUD For curation tool (GOBii GDM) to be effective it should have the basic functionalities: load, retrieve, update and delete information. | We will review how to implement full delete and update functionality by end of this year and roll out during 2019. | Yaw, Kevin | ||
7 | Core minimal: Create, Read, Update, and Delete (CRUD) As GOBii cannot perform these four functions reliably, it is in effect not a production-ready system. | We will focus on CRUD as the core system in the next year. | | Yaw |
7 | It must be able to load large volumes of data quickly | We load large volumes of data much faster than other open-source solutions, and much faster than the systems we worked with in industry | | |
7 | do at least minimal QC | ||||
7 | and be able to deliver the data for analysis | ||||
13 | Data integrity in data loading and data extraction GOBii is making great progress on building tools for manual data loading and data extraction, but it is hard to say we are completely satisfied when there have been errors in the loading and extraction processes. | | | Yaw |
13 | Software QC to improve new releases The team has been quick to respond to these issues, and we know that much stronger software QC processes are now in place and continue to improve the quality and confidence of each new release. | | | |
13 | Flex query extraction For data extraction, version 1.4 offers many basic extraction functionalities, but we are very excited to see the new features that will come with flex query in the future. | Flex query is in the post-v1.5 release | | Kevin, Phil |
13 | Hard delete-TimeScope We also look forward to having basic tools for deleting data in the coming year. | Hard delete will be released in the TimeScope tool in v1.5. We will train system admins/super users | Update TimeScope user and release documentation; schedule user training for appropriate use of TimeScope | Deb, Roy |
13 | Soft delete to restrict data access For data access control, there is currently no way to restrict access to certain data(sets) within the system, and that could be problematic in certain situations. | We will create access control and are starting to implement delete functionality with the TimeScope tool. We agree; we will incorporate this into the CRUD roll-out through 2019 | Design data access control by end of 2018; roll out functionality through 2019 | Yaw/Kevin |
13 | flag data status Also, it might be good to be able to associate some kinds of flags, e.g. related to data status that could be applied during the data extraction or utilization process | Need more info on what this would look like | |||
| 6 | Deployment Some install parameters are not well documented. We should find ways to bypass the lengthy back-up process during deployment and updates, which currently slows deployment down. | We are improving the back-up and restore process by implementing incremental back-up | Q4 2018 | Roy, Kevin |
7 | The system should be highly configurable so that it can be deployed in a wide range of enterprise IT contexts e.g. different authentications, email services, in cloud or on premise etc. | We agree and will work to simplify, automate and document configuration management | Design improvements to Q4 2018. Roll out Q1 and Q2 2019 | Roy | |
7 | Deployment should be amenable to being scripted/automated, which means e.g. all configurations should be with tokens that can be set with a script. | Roy | |||
7 | To ensure stable deployment, changes must be managed and documented, and impact on release process well documented. | Documentation will be improved | Continuous | Roy | |
13 | We have had some challenges with deployments in the past year, but I think the process is always improving. In general, we know that the QC process has gotten much better, with great testing and great reporting on bug fixes during each release (with test files, etc.). But there may still be room for improvement, especially as the number of variations of any new feature that need to be tested increases. Bug-fix tracking and communication management Also, it has been a bit frustrating in the small number of cases where we thought something was fixed and later learned that it was not. | | | Star, Josh |
| 7 | Ensure interdependencies among requirements are understood in prioritization The fundamental issue in requirement gathering is the lack of dependency analysis. Users tend to ask for the analytical functions of a system, as this is where the business value is, but these all require the basic CRUD functions; without them, the module cannot work within practical breeding. No normal user will think up front of asking for, e.g., a delete or update function in a database, but that does not mean it is not a requirement for the system. Requirement prioritization must consider both user requirements and technical IT requirements, and the interdependencies must be understood in the prioritization process. | | | Yaw, Kevin, Josh |
13 | The requirements gathering and prioritization process is still a little unclear. But, I am not sure how it can be refined when there are many different participants with their own priorities. And I know that it must be challenging as the requirements and priorities expressed by groups change over time. | Liz, Yaw | |||
| 13 | Track issues and requirements and communication management It is generally easy to report issues, but it is not always clear when they are being worked on, especially if they come in by email. We are grateful that the standards for submitting error reports are still relatively loose, but we want to continue to work together to make sure we provide the minimal standardized information needed to test an error without taking too much time (especially when it is not known whether the error was already reported). | | | Roy and Deb |
13 | Informative error logs for troubleshooting/diagnosis The process for defining new requirements is less clear to us. The error logs provided with failed data loads generally do not provide enough information to diagnose the specific problem that caused the failure. | | | Josh |
| 5 | Need data validation tools to validate data for loading and extraction A few issues related to extracted data accuracy were reported to GOBii and required some tests to fix the problems and validate them. I understand that this is part of the database development process, but there are no existing tools for curators to validate data once they are loaded and extracted. | | | |
5 | Need systematic data validation tool to handle large datasets This is especially difficult when datasets are large; with random data-validation approaches, we only find issues if we are lucky enough to identify them by chance | | | |
6 | Prove extract accuracy of patch fixes There are some issues with extract accuracy. I understand there are already efforts to "patch" the affected dataset, but we need to prove that this works. | | | Deb, Josh |
13 | The current extraction features are not very extensive but the flex query tool that will be available in the next version seems to add much more functionality. | Flex query will be released post v1.5 | Kevin, Phil | ||
13 | Usability: navigation to folders is not user-friendly The solution for navigating to folders to get data works pretty well for curators, but may not seem very user-friendly to other users. | We agree and have extensively reviewed web-based file browsers that can provide live links to files and support authentication. We have identified OwnCloud and are incorporating it into our emails and into the Marker Toolbox | Q4 2018 | Yaw, Josh |
13 | Informative error log The error logs generally do not provide enough information to help us figure out why an extract might have failed. We've also had some concerns, especially before v.1.4 about the integrity of the extracted data. But the data extraction is very fast! | ||||
14 | I have not been using GOBii yet, and have not interacted with core functions. I only know second hand of some issues with data import and extract. | ||||
| 13 | Training and workshops I gave a lower score for the workshops and training mostly because I think they have generally been premature. It likely does not make sense to provide any GOBii "training" or GS pipeline "training" to users outside the first three CGs until there is a system or a stable, completed pipeline that people can use. The minutes from the online meetings are generally very helpful, and targeted in-person meetings are also usually quite helpful. | Our project required us to reach out to additional CGs and NARS at this stage, but we agree that this has proven premature before the database is available to them. The tools we have developed are well received by wider partners, though. Given our limited resources, in 2019 we will focus our training on our CG center partners and related NARS | We will focus our efforts through 2019 on our existing CG center customers | Star |
| 7 | In the case of GOBii, the team started working on analytics functions based on user requests well before the foundational functions were finalized, and the result is that we now have dispersed partial functions that do not make up a working system. | | | |
13 | QC tools with KDCompute KDC provides some good statistics, but it is unclear how to use these in a functional production workflow. | We agree that having KDCompute in place is only the first step in the process, and we have probably not promoted the long-term view of the process sufficiently. We will next provide a mechanism to filter the data and reload it to GOBii. The non-QCd dataset can then be deleted, or eventually hidden from the user using 'soft delete' | | |
13 | QC tools with KDCompute It may also be confusing to disentangle the data that should be used for "genotypic data" QC versus "germplasm QC", e.g. the F1 verification information. | All tests in KDCompute are meant to QC genotyping data. Both data quality and genetic quality can be useful for this purpose. | | |
14 | Batch processing MABC using Galaxy workflow Following the recent training, I can see how Galaxy can be helpful for streamlining the MABC batch processes. I'm willing to try this, but would need to schedule an online session to go over the steps once I have data sets ready. | ||||
14 | Passing genealogies from breeding management systems I think the F1 and line verification basics are in place, but I would like to see how the system would work for evaluating hundreds of F1s against their parents, or for assessing all new lines for parent verification. Some of the requirements for this sort of workflow would need information from the enterprise breeding system, which I do not know is there (i.e. Parent 1, Parent 2 metadata). | We are re-evaluating how the F1 pedigree verification tests function for high-volume 'real-life' datasets. We are finding that our tools do not handle some use cases well. These will be redesigned in Flapjack and KDCompute to accommodate these use cases | Through 2019 | Carlos |
14 | I do not know how genealogies are tracked in EBS/BMS or how that information could be used efficiently in an F1 or line validation tool. | We are re-evaluating how the F1 pedigree verification tests function for high-volume 'real-life' datasets. We are finding that our tools do not handle some use cases well. These will be redesigned in Flapjack and KDCompute to accommodate these use cases | As appropriate | Carlos |
15 | No one-stop shop workflow yet | ||||
19 | Flapjack has the potential to be the most routinely used and most sought after, with immediate impact on usage by the breeding community, especially for QC and MAS/early-generation screening in the future. GS-Galaxy needs more build-up and use cases; it could be next in line for impact. | We have incorporated into GOBii some breeding-management fields that are needed for marker data analysis and to accommodate not having breeding management systems up and running at CG centers. As we integrate with breeding management systems, we will transition to pulling that information from the correct authoritative system. | | |
10. Scope and prioritization | 6 | Core curation functionality and accuracy of extract should take precedence over these. | |||
7 | Given that only limited time remains, ALL possible resources must be re-prioritized to address these basic functions, as all other functions depend on them. This most likely implies reducing, e.g., analytics and training work in order to secure basic functionality of high quality. GOBii needs to recognize that the core genotypic data module is not finalized; considering how little time is left, focus should be on creating a core genotypic database with complete CRUD and some QC functions. Tools are nice to have, but of little use if the underlying genotyping pipeline is not efficient. Given how little time is left of GOBii, it is critical to analyze the minimal functionality of a core genotypic database module and focus on making that work. | Agree; as above, we are focusing on the core system | | |
| 7 | Integration with breeding management systems For GOBii to be used widely, the core module should require only SampleIDs. All additional germplasm data should be removed from the core module and made optional, coming from the germplasm system. | Our schema was developed to compensate for the lack of breeding management systems at most institutes, before we fully understood sample-tracking use cases. We now agree that we need to accommodate both the existence and non-existence of breeding systems. We are revising the schema to accommodate both scenarios easily and will roll out a migration plan in the next 2 months | Implementation plan to accommodate having only SampleIDs: Q4 2018; likely implement by Q2 2019 | Kevin |
13 | Integration with breeding management Prioritize and ensure joint development efforts, e.g. so that a breeding system can generate a query for information to GOBii and then get information back into a fieldbook | We are focusing on the sample-tracking use case for integration. If we can coordinate and standardize sample tracking and APIs, then these queries will be straightforward. We will continue to engage through BrAPI and drive coordination of data entries across GOBii institutes and HTPG projects | | |
13 | Tool development How to determine how much effort should be invested in creating a tool that can support smaller-scale programs versus one that has almost no front end and must plug into other breeding data systems to be functional. | | | |
19 | Need a very clear and fast timeline for production-scale implementation for routine use | | | |
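Several comments and responses in the table above distinguish "soft delete" (hiding data from users while retaining it, planned for access control) from "hard delete" (permanent removal, planned for the TimeScope tool). The following is an illustrative sketch only, not GOBii code: it shows the general pattern of flagging records as deleted and filtering them out of reads, with all names (`Dataset`, `DatasetStore`) invented for the example.

```python
# Illustrative sketch of soft vs. hard delete -- not GOBii's implementation.
# Soft delete flags a record and hides it from queries; hard delete removes it.

from dataclasses import dataclass, field

@dataclass
class Dataset:
    dataset_id: int
    name: str
    deleted: bool = False  # soft-delete flag

@dataclass
class DatasetStore:
    rows: list = field(default_factory=list)

    def soft_delete(self, dataset_id: int) -> None:
        # Hide the dataset without physically removing it.
        for row in self.rows:
            if row.dataset_id == dataset_id:
                row.deleted = True

    def visible(self) -> list:
        # Extraction and queries only see non-deleted rows.
        return [r for r in self.rows if not r.deleted]

    def hard_delete(self, dataset_id: int) -> None:
        # Permanent removal (the "hard delete" described for TimeScope).
        self.rows = [r for r in self.rows if r.dataset_id != dataset_id]
```

The design point raised by the curators maps onto this pattern: soft delete supports restricting access to non-QCd or sensitive datasets reversibly, while hard delete is reserved for admins/super users.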
Regarding the overall scope and road map, users remain somewhat unclear, as the figures below show quite diverse interests and priorities.
Figure 3. GOBii scope and prioritization (multiple selection)