PMID- 37105115
OWN - NLM
STAT- MEDLINE
DCOM- 20230509
LR  - 20230512
IS  - 1879-0534 (Electronic)
IS  - 0010-4825 (Linking)
VI  - 159
DP  - 2023 Jun
TI  - MLSpatial: A machine-learning method to reconstruct the spatial distribution of cells from scRNA-seq by extracting spatial features.
PG  - 106873
LID - S0010-4825(23)00338-4 [pii]
LID - 10.1016/j.compbiomed.2023.106873 [doi]
AB  - MOTIVATION: Single-cell RNA sequencing (scRNA-seq) technologies allow us to interrogate the state of an individual cell within its microenvironment. However, cells must first be dissociated before sequencing, which makes it difficult to obtain their spatial information. Since the spatial distribution of cells is critical in some settings, such as cancer immunotherapy, we present MLSpatial, a novel computational method that learns the relationship between gene expression patterns and the spatial locations of cells, and then predicts the cell-to-cell distance distribution from scRNA-seq data alone. RESULTS: We collected the Drosophila embryo dataset, which contains both fluorescence in situ hybridization (FISH) data and scRNA-seq data of the Drosophila embryo. The FISH data provide the spatial positions of 3039 cells and the expression of 84 genes for each cell. The scRNA-seq data contain the expression of 8924 genes in 1297 high-quality cells whose locations are unknown. For comparison, we also collected MERFISH data of 645 osteosarcoma cells with known cell locations and the expression status of 10,050 genes. For each dataset, the cells were randomly divided into a training set and a test set in a ratio of 7:3. In the Drosophila embryo FISH data, the cell-to-cell distances extracted by our model corresponded more closely to the real distances (correlation coefficient 0.99) than those of existing methods. In the osteosarcoma data, by contrast, our model captured the spatial relationship between cells with a correlation of 0.514 to the real situation. We also applied the model trained on the Drosophila embryo FISH data to the Drosophila embryo scRNA-seq data, for which the real locations of cells are unknown. The reconstructed pseudo-embryo and the real embryo (as shown by the FISH data) were highly similar in the spatial distribution of gene expression. CONCLUSION: MLSpatial can accurately restore the relative positions of cells from scRNA-seq data; however, the performance depends on the type of cells. The trained model might be useful for reconstructing the spatial distribution of single cells from scRNA-seq data alone, provided that the scRNA-seq data and the FISH data come from a similar background (i.e., the same tissue with a similar disease background).
CI  - Copyright (c) 2023 Elsevier Ltd. All rights reserved.
FAU - Zhu, Mengbo
AU  - Zhu M
AD  - Department of Mathematics, Ocean University of China, Qingdao, 266100, China; Geneis Beijing Co., Ltd., Beijing, 100102, China.
FAU - Li, Changjun
AU  - Li C
AD  - Department of Mathematics, Ocean University of China, Qingdao, 266100, China. Electronic address: licj@ouc.edu.cn.
FAU - Lv, Kebo
AU  - Lv K
AD  - Department of Mathematics, Ocean University of China, Qingdao, 266100, China.
FAU - Guo, Hongzhe
AU  - Guo H
AD  - Geneis Beijing Co., Ltd., Beijing, 100102, China. Electronic address: guohz@geneis.cn.
FAU - Hou, Rui
AU  - Hou R
AD  - Geneis Beijing Co., Ltd., Beijing, 100102, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, 266000, China.
FAU - Tian, Geng
AU  - Tian G
AD  - Geneis Beijing Co., Ltd., Beijing, 100102, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, 266000, China.
FAU - Yang, Jialiang
AU  - Yang J
AD  - Geneis Beijing Co., Ltd., Beijing, 100102, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, 266000, China; Chifeng Municipal Hospital, Chifeng, Inner Mongolia, 024000, China; Academician Workstation, Changsha Medical University, Changsha, 410219, China. Electronic address: yangjl@geneis.cn.
LA  - eng
PT  - Journal Article
PT  - Research Support, Non-U.S. Gov't
DEP - 20230418
PL  - United States
TA  - Comput Biol Med
JT  - Computers in biology and medicine
JID - 1250250
SB  - IM
MH  - *Software
MH  - *Gene Expression Profiling/methods
MH  - Sequence Analysis, RNA/methods
MH  - In Situ Hybridization, Fluorescence
MH  - Single-Cell Gene Expression Analysis
MH  - Single-Cell Analysis/methods
MH  - Machine Learning
COIS- Declaration of competing interest: Mengbo Zhu, Rui Hou, Geng Tian and Jialiang Yang are employed by Geneis Beijing Co., Ltd., Beijing. The other authors declare that they have no competing interests.
EDAT- 2023/04/28 00:42
MHDA- 2023/05/09 06:42
CRDT- 2023/04/27 18:10
PHST- 2023/02/01 00:00 [received]
PHST- 2023/03/03 00:00 [revised]
PHST- 2023/03/30 00:00 [accepted]
PHST- 2023/05/09 06:42 [medline]
PHST- 2023/04/28 00:42 [pubmed]
PHST- 2023/04/27 18:10 [entrez]
AID - S0010-4825(23)00338-4 [pii]
AID - 10.1016/j.compbiomed.2023.106873 [doi]
PST - ppublish
SO  - Comput Biol Med. 2023 Jun;159:106873. doi: 10.1016/j.compbiomed.2023.106873. Epub 2023 Apr 18.
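
The evaluation protocol summarized in the abstract (a random 7:3 split of cells, then a correlation between predicted and real cell-to-cell distances) can be illustrated with a minimal sketch. This is not the MLSpatial model: the record does not describe the model's internals or name the correlation coefficient used, so Pearson correlation is an assumption here, the "predicted" coordinates are simply the true coordinates plus noise, and every function name (`split_7_3`, `pairwise_distances`, `pearson`) is hypothetical.

```python
import math
import random
from itertools import combinations


def split_7_3(n_cells, seed=0):
    """Randomly split cell indices into training and test sets in a 7:3 ratio."""
    idx = list(range(n_cells))
    random.Random(seed).shuffle(idx)
    cut = int(round(0.7 * n_cells))
    return idx[:cut], idx[cut:]


def pairwise_distances(coords):
    """Euclidean distance for every unordered pair of cells, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Toy data: 50 cells with random 3-D positions (a stand-in for FISH coordinates).
rng = random.Random(1)
true_coords = [(rng.random(), rng.random(), rng.random()) for _ in range(50)]

train_idx, test_idx = split_7_3(len(true_coords))
test_true = [true_coords[i] for i in test_idx]

# Assumption: a trained model would predict positions from expression here.
# We fake its output by adding small Gaussian noise to the true positions.
test_pred = [tuple(c + rng.gauss(0, 0.01) for c in p) for p in test_true]

# Compare the predicted and real cell-to-cell distances on the held-out cells.
r = pearson(pairwise_distances(test_true), pairwise_distances(test_pred))
print(f"distance correlation on test cells: {r:.3f}")
```

Comparing pairwise distances rather than raw coordinates makes the score insensitive to rotations and translations of the reconstructed tissue, which matches the abstract's emphasis on the relative position of cells.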