TY - JOUR
T1 - Genome puzzle master (GPM): An integrated pipeline for building and editing pseudomolecules from fragmented sequences
AU - Zhang, Jianwei
AU - Kudrna, Dave
AU - Mu, Ting
AU - Li, Weiming
AU - Copetti, Dario
AU - Yu, Yeisoo
AU - Goicoechea, Jose Luis
AU - Lei, Yang
AU - Wing, Rod A.
N1 - Generated from Scopus record by KAUST IRTS on 2019-11-20
PY - 2016/10/15
Y1 - 2016/10/15
AB - Motivation: Next generation sequencing technologies have revolutionized our ability to rapidly and affordably generate vast quantities of sequence data. Once generated, raw sequences are assembled into contigs or scaffolds. However, these assemblies are mostly fragmented and inaccurate at the whole genome scale, largely due to the inability to integrate additional informative datasets (e.g. physical, optical and genetic maps). To address this problem, we developed a semi-automated software tool - Genome Puzzle Master (GPM) - that enables the integration of additional genomic signposts to edit and build 'new-gen-assemblies' that result in high-quality 'annotation-ready' pseudomolecules. Results: With GPM, loaded datasets can be connected to each other via their logical relationships which accomplishes tasks to 'group,' 'merge,' 'order and orient' sequences in a draft assembly. Manual editing can also be performed with a user-friendly graphical interface. Final pseudomolecules reflect a user's total data package and are available for long-term project management. GPM is a web-based pipeline and an important part of a Laboratory Information Management System (LIMS) which can be easily deployed on local servers for any genome research laboratory.
UR - https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btw370
UR - http://www.scopus.com/inward/record.url?scp=84995466960&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btw370
DO - 10.1093/bioinformatics/btw370
M3 - Article
SN - 1460-2059
VL - 32
JO - Bioinformatics
JF - Bioinformatics
IS - 20
ER -