PMID- 33288905
OWN - NLM
STAT- MEDLINE
DCOM- 20210414
LR  - 20210731
IS  - 1546-1696 (Electronic)
IS  - 1087-0156 (Print)
IS  - 1087-0156 (Linking)
VI  - 39
IP  - 3
DP  - 2021 Mar
TI  - Chromosome-scale, haplotype-resolved assembly of human genomes.
PG  - 309-312
LID - 10.1038/s41587-020-0711-0 [doi]
AB  - Haplotype-resolved or phased genome assembly provides a complete picture of genomes and their complex genetic variations. However, current algorithms for phased assembly either do not generate chromosome-scale phasing or require pedigree information, which limits their application. We present a method named diploid assembly (DipAsm) that uses long, accurate reads and long-range conformation data for single individuals to generate a chromosome-scale phased assembly within 1 day. Applied to four public human genomes, PGP1, HG002, NA12878 and HG00733, DipAsm produced haplotype-resolved assemblies with minimum contig length needed to cover 50% of the known genome (NG50) up to 25 Mb and phased ~99.5% of heterozygous sites at 98-99% accuracy, outperforming other approaches in terms of both contiguity and phasing completeness. We demonstrate the importance of chromosome-scale phased assemblies for the discovery of structural variants (SVs), including thousands of new transposon insertions, and of highly polymorphic and medically important regions such as the human leukocyte antigen (HLA) and killer cell immunoglobulin-like receptor (KIR) regions. DipAsm will facilitate high-quality precision medicine and studies of individual haplotype variation and population diversity.
FAU - Garg, Shilpa
AU  - Garg S
AUID- ORCID: 0000-0003-0200-4200
AD  - Department of Genetics, Harvard Medical School, Boston, MA, USA. shilpa_garg@hms.harvard.edu.
AD  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA. shilpa_garg@hms.harvard.edu.
AD  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA. shilpa_garg@hms.harvard.edu.
FAU - Fungtammasan, Arkarachai
AU  - Fungtammasan A
AD  - DNAnexus, Mountain View, CA, USA.
FAU - Carroll, Andrew
AU  - Carroll A
AD  - Google, Mountain View, CA, USA.
FAU - Chou, Mike
AU  - Chou M
AD  - Department of Genetics, Harvard Medical School, Boston, MA, USA.
FAU - Schmitt, Anthony
AU  - Schmitt A
AD  - Arima Genomics, San Diego, CA, USA.
FAU - Zhou, Xiang
AU  - Zhou X
AD  - Arima Genomics, San Diego, CA, USA.
FAU - Mac, Stephen
AU  - Mac S
AD  - Arima Genomics, San Diego, CA, USA.
FAU - Peluso, Paul
AU  - Peluso P
AD  - Pacific Biosciences, Menlo Park, CA, USA.
FAU - Hatas, Emily
AU  - Hatas E
AD  - Pacific Biosciences, Menlo Park, CA, USA.
FAU - Ghurye, Jay
AU  - Ghurye J
AD  - Dovetail Genomics, Scotts Valley, CA, USA.
FAU - Maguire, Jared
AU  - Maguire J
AD  - Dovetail Genomics, Scotts Valley, CA, USA.
FAU - Mahmoud, Medhat
AU  - Mahmoud M
AUID- ORCID: 0000-0002-2553-4231
AD  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
FAU - Cheng, Haoyu
AU  - Cheng H
AD  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA.
AD  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
FAU - Heller, David
AU  - Heller D
AUID- ORCID: 0000-0001-8346-9565
AD  - Max Planck Institute for Molecular Genetics, Berlin, Germany.
FAU - Zook, Justin M
AU  - Zook JM
AUID- ORCID: 0000-0003-2309-8402
AD  - Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA.
FAU - Moemke, Tobias
AU  - Moemke T
AD  - Saarland University, Saarbrucken, Germany.
FAU - Marschall, Tobias
AU  - Marschall T
AUID- ORCID: 0000-0002-9376-1030
AD  - Saarland University, Saarbrucken, Germany.
AD  - Max Planck Institute for Informatics, Saarbrucken, Germany.
FAU - Sedlazeck, Fritz J
AU  - Sedlazeck FJ
AUID- ORCID: 0000-0001-6040-2691
AD  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
FAU - Aach, John
AU  - Aach J
AD  - Department of Genetics, Harvard Medical School, Boston, MA, USA.
FAU - Chin, Chen-Shan
AU  - Chin CS
AUID- ORCID: 0000-0003-4394-2455
AD  - DNAnexus, Mountain View, CA, USA. jchin@dnanexus.com.
FAU - Church, George M
AU  - Church GM
AUID- ORCID: 0000-0003-3535-2076
AD  - Department of Genetics, Harvard Medical School, Boston, MA, USA. gchurch@genetics.med.harvard.edu.
FAU - Li, Heng
AU  - Li H
AUID- ORCID: 0000-0003-4874-2874
AD  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA. hli@ds.dfci.harvard.edu.
AD  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA. hli@ds.dfci.harvard.edu.
LA  - eng
GR  - R01 HG010040/HG/NHGRI NIH HHS/United States
GR  - RM1 HG008525/HG/NHGRI NIH HHS/United States
GR  - U01 HG010971/HG/NHGRI NIH HHS/United States
GR  - K99 HG010906/HG/NHGRI NIH HHS/United States
GR  - UM1 HG008898/HG/NHGRI NIH HHS/United States
PT  - Journal Article
PT  - Research Support, N.I.H., Extramural
DEP - 20201207
PL  - United States
TA  - Nat Biotechnol
JT  - Nature biotechnology
JID - 9604648
SB  - IM
MH  - Algorithms
MH  - *Chromosomes, Human
MH  - *Genome, Human
MH  - *Haplotypes
MH  - Heterozygote
MH  - Humans
MH  - Polymorphism, Single Nucleotide
PMC - PMC7954703
MID - NIHMS1630556
COIS- F.J.S. obtained a Pacbio SMRT grant in 2019 and had multiple travels sponsored by Pacific Biosciences and Oxford Nanopore Technologies. E.H. and P.P. are employees of Pacific Biosciences. C.-S.C. and A.F. are employees of DNAnexus. A.S., X.Z. and S.M. are employees of Arima Genomics. J.G. and J.M. are employees of Dovetail Genomics. A.C. is an employee of Google. H.L. is a consultant of Integrated DNA Technologies, Inc. and on the Scientific Advisory Boards of Sentieon, Inc., BGI and OrigiMed. G.M.C. is a cofounder of Editas Medicine and has other financial interests, listed at http://arep.med.harvard.edu/gmc/tech.html.
EDAT- 2020/12/09 06:00
MHDA- 2021/04/15 06:00
PMCR- 2020/12/07
CRDT- 2020/12/08 05:44
PHST- 2019/10/21 00:00 [received]
PHST- 2020/09/17 00:00 [accepted]
PHST- 2020/09/09 00:00 [revised]
PHST- 2020/12/09 06:00 [pubmed]
PHST- 2021/04/15 06:00 [medline]
PHST- 2020/12/08 05:44 [entrez]
PHST- 2020/12/07 00:00 [pmc-release]
AID - 10.1038/s41587-020-0711-0 [pii]
AID - 711 [pii]
AID - 10.1038/s41587-020-0711-0 [doi]
PST - ppublish
SO  - Nat Biotechnol. 2021 Mar;39(3):309-312. doi: 10.1038/s41587-020-0711-0. Epub 2020 Dec 7.