Objective
Store INDELs from VCF, HapMap, and InterTek formats for the 2-letter and 4-letter nucleotide data types.
Joshua Lamos-Sweeney, Yaw Nti-Addae: up to what size of INDEL should be supported, or does size not matter for the look-up table solution?
Requirements
No. | Requirement | User Story | Importance | Jira Issue | Release |
---|---|---|---|---|---|
1 | Place INDELs with Missing | | HIGH | | 2.2.3 |
2 | Store INDELs from VCF | | MEDIUM | | 3.0 |
3 | Manage + and - | + and - should be stored as is and extracted as is | TEST | | |
4 | Store INDELs from all other formats | | HIGH | | 3.0 |
5 | Extract data with INDELs | Allow the user to specify whether to output INDELs; if not, all INDELs are replaced with N/N or NN | MEDIUM | | 3.0 |
6 | Web service extract | Modify web service calls to handle INDELs. Allow the user to specify whether to output INDELs; if not, all INDELs are replaced with N/N or NN | MEDIUM | | 3.0 |
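
The extract behaviour in requirements 5 and 6 could be sketched roughly as follows. This is only an illustration, not the GDM implementation: the function name `mask_indels`, the `keep_indels` flag, and the rule that a call counts as an INDEL when any allele is `+` or `-` are all assumptions (the INDEL definitions are still an open question). Applying the same masking to 4-letter calls is likewise assumed.

```python
# Hypothetical sketch: names and the INDEL rule below are assumptions,
# not part of the existing extract code.
INDEL_SYMBOLS = {"+", "-"}  # assumed from requirement 3 ("Manage + and -")


def mask_indels(call: str, keep_indels: bool, sep: str = "/") -> str:
    """Return the call unchanged, or with every allele set to N if it holds an INDEL."""
    if keep_indels:
        # The user asked for INDELs in the output: extract as stored.
        return call
    has_sep = sep in call
    # Separated calls ("+/-") split on the separator; unseparated calls
    # ("+-") are assumed to have one character per allele.
    alleles = call.split(sep) if has_sep else list(call)
    if not any(allele in INDEL_SYMBOLS for allele in alleles):
        return call
    joiner = sep if has_sep else ""
    return joiner.join("N" for _ in alleles)
```

With this sketch, `mask_indels("+/-", keep_indels=False)` yields `N/N` and `mask_indels("A-", keep_indels=False)` yields `NN`, matching the two replacement forms named in the requirements, while calls without `+` or `-` pass through untouched.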
User interaction and design
Open Questions
Question | Answer | Date Answered |
---|---|---|
Need INDEL definitions from Liz | draft definitions here | |
Sample InterTek files with INDELs | Note: the exact layout of the InterTek files may have changed (e.g., the coordinates of the first data point), but the genotype data show examples of how the INDELs could look | |
Sample VCF and other file formats with INDELs | | |
Downstream Pipelines
Pipeline | System/Tool | Notes |
---|---|---|
QC | KDcompute | |
DArTview | DArTview | |
Extract UI | GDM | Add separator appropriately (standard is “/”) |
BrAPI extracts | GDM | Add separator appropriately (standard is “/”) |
Flapjack bytes | | |
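
The "add separator appropriately" note for the Extract UI and BrAPI extracts might look something like the sketch below. The helper name `add_separator` is hypothetical, and it assumes that stored calls without a separator use exactly one character per allele (including `+` and `-`); multi-character alleles would need a different representation.

```python
# Hypothetical helper; assumes one character per allele in unseparated calls.
DEFAULT_SEPARATOR = "/"  # the standard separator named in the table above


def add_separator(call: str, sep: str = DEFAULT_SEPARATOR) -> str:
    """Insert the separator between alleles unless the call already has it."""
    if sep in call:
        return call
    # Joining a string places the separator between its characters.
    return sep.join(call)
```

Under these assumptions a 2-letter call such as `+-` becomes `+/-` and a 4-letter call such as `ACGT` becomes `A/C/G/T`, while calls that already contain the separator are returned as is.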