scBKAP: a clustering model for single-cell RNA-seq data based on bisecting K-means

Xiaolin Wang, Hongli Gao, Ren Qi, Ruiqing Zheng, Xin Gao, Bin Yu

Research output: Contribution to journal, Article, peer-review

1 Scopus citation

Abstract

Advances in single-cell RNA sequencing (scRNA-seq) technologies allow researchers to analyze genome-wide transcription profiles and to address biological problems at individual-cell resolution. However, existing clustering methods for scRNA-seq data suffer from the high dropout rate and the curse of dimensionality in the data. Here, we propose a novel pipeline, scBKAP, the cornerstone of which is a single-cell bisecting K-means clustering method based on an autoencoder network and a dimensionality reduction model, MPDR. Specifically, scBKAP uses an autoencoder network to reconstruct gene expression values from scRNA-seq data to alleviate the dropout issue, and the MPDR model, composed of the M3Drop feature selection algorithm and the PHATE dimensionality reduction algorithm, to reduce the dimensionality of the reconstructed data. The dimensionality-reduced data are then fed into the bisecting K-means clustering algorithm to identify cell clusters. Comprehensive experiments demonstrate scBKAP's superior performance over nine state-of-the-art single-cell clustering methods on 21 public scRNA-seq datasets and simulated datasets.
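
The final clustering step named in the abstract can be illustrated with a minimal, dependency-free sketch of generic bisecting K-means. This is not the authors' implementation: the autoencoder, M3Drop, and PHATE stages are omitted, the input is assumed to be already dimensionality-reduced, and the rule of always splitting the cluster with the largest within-cluster sum of squared errors is an assumption (the abstract does not state the splitting criterion). All function names are hypothetical.

```python
import random


def _sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def _sse(points):
    """Within-cluster sum of squared errors around the centroid."""
    center = _mean(points)
    return sum(_sq_dist(p, center) for p in points)


def _two_means(points, rng, n_iter=20):
    """Split points into two groups with plain 2-means; returns two index lists."""
    centers = rng.sample(points, 2)
    left, right = [], []
    for _ in range(n_iter):
        left, right = [], []
        for i, p in enumerate(points):
            closer_to_first = _sq_dist(p, centers[0]) <= _sq_dist(p, centers[1])
            (left if closer_to_first else right).append(i)
        if not left or not right:
            break
        centers = [_mean([points[i] for i in left]),
                   _mean([points[i] for i in right])]
    if not left or not right:
        # Degenerate case (e.g. duplicate points): fall back to an even split.
        mid = len(points) // 2
        return list(range(mid)), list(range(mid, len(points)))
    return left, right


def bisecting_kmeans(points, k, seed=0):
    """Return one cluster label per point, using at most k clusters.

    Assumption: the cluster with the largest SSE is bisected at each step.
    """
    rng = random.Random(seed)
    clusters = [list(range(len(points)))]
    while len(clusters) < k:
        splittable = [c for c in clusters if len(c) >= 2]
        if not splittable:
            break
        target = max(splittable, key=lambda c: _sse([points[i] for i in c]))
        clusters.remove(target)
        left, right = _two_means([points[i] for i in target], rng)
        clusters.append([target[i] for i in left])
        clusters.append([target[i] for i in right])
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy stand-in for low-dimensional cell embeddings: three separated groups.
offsets = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
embedding = [[cx + dx, cy + dy]
             for cx, cy in [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
             for dx, dy in offsets]
labels = bisecting_kmeans(embedding, k=3)
```

Here `embedding` is synthetic toy data standing in for the reduced representation; in a real analysis each row would be one cell's coordinates after feature selection and dimensionality reduction.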
Original language: English (US)
Pages (from-to): 1-10
Number of pages: 10
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
State: Published - Dec 19 2022

Bibliographical note

KAUST Repository Item: Exported on 2022-12-21
Acknowledged KAUST grant number(s): FCC/1/1976-17, FCC/1/1976-23, FCC/1/1976-26, REI/1/0018-01-01, REI/1/4473-01-01, URF/1/3412-01, URF/1/3450-01, URF/1/4098-01-01
Acknowledgements: We thank anonymous reviewers for valuable suggestions and comments. This work was supported by the National Natural Science Foundation of China (No. 62172248), the Natural Science Foundation of Shandong Province of China (No. ZR2021MF098), and the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No. FCC/1/1976-17, FCC/1/1976-23, FCC/1/1976-26, URF/1/3450-01, URF/1/3412-01, URF/1/4098-01-01, REI/1/0018-01-01, and REI/1/4473-01-01.
