NeuralPolish: a novel Nanopore polishing method based on alignment matrix construction and orthogonal Bi-GRU Networks

Neng Huang, Fan Nie, Peng Ni, Feng Luo, Xin Gao, Jianxin Wang

Research output: Contribution to journal › Article › peer-review

11 Scopus citations

Abstract

Motivation: Oxford Nanopore sequencing, which produces long reads at low cost, has enabled many breakthroughs in genomics studies. However, the large number of errors in Nanopore genome assemblies affects the accuracy of genome analysis. Polishing is a procedure that corrects errors in a genome assembly and can improve the reliability of downstream analysis. However, the performance of existing polishing methods is still not satisfactory.

Results: We developed a novel polishing method, NeuralPolish, to correct the errors in assemblies based on alignment matrix construction and orthogonal Bi-GRU networks. In this method, we designed an alignment feature matrix to represent the read-to-assembly alignment. Each row of the matrix represents a read, and each column represents the bases aligned to one position of the contig. In the network architecture, a bi-directional GRU network extracts the sequence information inside each read by processing the alignment matrix row by row. The resulting feature matrix is then processed column by column by another bi-directional GRU network to calculate the probability distribution. Finally, a CTC decoder generates the polished sequence with a greedy algorithm. We tested on five real data sets with three assembly tools (Wtdbg2, Flye and Canu) and compared the results of different polishing methods: NeuralPolish, Racon, MarginPolish, HELEN and Medaka. Comprehensive experiments demonstrate that NeuralPolish produces more accurate assemblies with fewer errors than the other polishing methods and can improve the accuracy of assemblies obtained from different assemblers.
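The two ends of the pipeline described above, the alignment matrix and the greedy CTC decoding step, can be illustrated with a minimal Python sketch. It is not the NeuralPolish implementation: the function names, the padding and blank symbols, and the toy inputs are all assumptions made for illustration. Insertions relative to the contig are ignored, and hand-written probabilities stand in for the output of the Bi-GRU networks.

```python
# Illustrative sketch only; names and encodings are assumptions,
# not taken from the NeuralPolish source code.

BLANK = "_"  # hypothetical CTC blank symbol
ALPHABET = ["A", "C", "G", "T", BLANK]


def build_alignment_matrix(contig_len, alignments, pad="."):
    """Build a matrix whose rows are reads and whose columns are contig positions.

    `alignments` is a list of (start, aligned_bases) pairs. `aligned_bases`
    holds one symbol per contig position covered by the read, with '-'
    marking a deletion. Uncovered positions are filled with `pad`.
    """
    matrix = []
    for start, bases in alignments:
        row = [pad] * contig_len
        for offset, base in enumerate(bases):
            pos = start + offset
            if 0 <= pos < contig_len:
                row[pos] = base
        matrix.append(row)
    return matrix


def ctc_greedy_decode(probs, alphabet=ALPHABET, blank=BLANK):
    """Greedy CTC decoding: argmax per step, merge repeats, drop blanks."""
    best = [alphabet[max(range(len(p)), key=p.__getitem__)] for p in probs]
    out = []
    prev = None
    for sym in best:
        if sym != prev and sym != blank:
            out.append(sym)
        prev = sym
    return "".join(out)


# Two toy reads aligned to a contig of length 6.
matrix = build_alignment_matrix(6, [(0, "ACGT"), (2, "G-AC")])
for row in matrix:
    print("".join(row))

# Toy per-step probability distributions over ALPHABET.
probs = [
    [0.9, 0.0, 0.0, 0.0, 0.1],  # A
    [0.8, 0.1, 0.0, 0.0, 0.1],  # A again, merged with the previous step
    [0.1, 0.0, 0.0, 0.0, 0.9],  # blank
    [0.7, 0.1, 0.1, 0.0, 0.1],  # A after a blank, kept as a new base
    [0.0, 0.1, 0.8, 0.0, 0.1],  # G
]
print(ctc_greedy_decode(probs))
```

In this toy run the matrix rows print as `ACGT..` and `..G-AC`, and the decoder returns `AAG`: the blank between the second and third `A` is what lets a genuine repeated base survive the merge step.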
Original language: English (US)
Journal: Bioinformatics
State: Published - May 11 2021

Bibliographical note

KAUST Repository Item: Exported on 2021-06-11
Acknowledged KAUST grant number(s): FCC/1/1976-26-01, URF/1/3412-01, URF/1/4098-01-01, REI/1/4473-01-01
Acknowledgements: This work was supported in part by the National Natural Science Foundation of China under Grants (Nos. U1909208 and 61772557), 111 Project (No. B18059), Hunan Provincial Science and Technology Program (No. 2018wk4001) to J.W., the U.S. National Institute of Food and Agriculture (NIFA) under grant 2017-70016-26051 and the U.S. National Science Foundation (NSF) under grant ABI-1759856 to F.L., and the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No. FCC/1/1976-26-01, URF/1/3412-01-01, URF/1/4098-01-01, and REI/1/4473-01-01 to X.G.

ASJC Scopus subject areas

  • Biochemistry
  • Computational Theory and Mathematics
  • Computational Mathematics
  • Molecular Biology
  • Statistics and Probability
  • Computer Science Applications
