PMID- 33901175
OWN - NLM
STAT- MEDLINE
DCOM- 20210827
LR  - 20210827
IS  - 1553-7404 (Electronic)
IS  - 1553-7390 (Print)
IS  - 1553-7390 (Linking)
VI  - 17
IP  - 4
DP  - 2021 Apr
TI  - Virus-derived variation in diverse human genomes.
PG  - e1009324
LID - 10.1371/journal.pgen.1009324 [doi]
LID - e1009324
AB  - Acquisition of genetic material from viruses by their hosts can generate inter-host structural genome variation. We developed computational tools enabling us to study virus-derived structural variants (SVs) in population-scale whole genome sequencing (WGS) datasets and applied them to 3,332 humans. Although SVs had already been cataloged in these subjects, we found previously-overlooked virus-derived SVs. We detected non-germline SVs derived from squirrel monkey retrovirus (SMRV), human immunodeficiency virus 1 (HIV-1), and human T lymphotropic virus (HTLV-1); these variants are attributable to infection of the sequenced lymphoblastoid cell lines (LCLs) or their progenitor cells and may impact gene expression results and the biosafety of experiments using these cells. In addition, we detected new heritable SVs derived from human herpesvirus 6 (HHV-6) and human endogenous retrovirus-K (HERV-K). We report the first solo-direct repeat (DR) HHV-6 likely to reflect DR rearrangement of a known full-length endogenous HHV-6. We used linkage disequilibrium between single nucleotide variants (SNVs) and variants in reads that align to HERV-K, which often cannot be mapped uniquely using conventional short-read sequencing analysis methods, to locate previously-unknown polymorphic HERV-K loci. Some of these loci are tightly linked to trait-associated SNVs, some are in complex genome regions inaccessible by prior methods, and some contain novel HERV-K haplotypes likely derived from gene conversion from an unknown source or introgression.
      These tools and results broaden our perspective on the coevolution between viruses and humans, including ongoing virus-to-human gene transfer contributing to genetic variation between humans.
FAU - Kojima, Shohei
AU  - Kojima S
AUID- ORCID: 0000-0002-6764-4818
AD  - Genome Immunobiology RIKEN Hakubi Research Team, RIKEN Center for Integrative Medical Sciences and RIKEN Cluster for Pioneering Research, Yokohama, Japan.
FAU - Kamada, Anselmo Jiro
AU  - Kamada AJ
AUID- ORCID: 0000-0002-0437-0151
AD  - Genome Immunobiology RIKEN Hakubi Research Team, RIKEN Center for Integrative Medical Sciences and RIKEN Cluster for Pioneering Research, Yokohama, Japan.
FAU - Parrish, Nicholas F
AU  - Parrish NF
AUID- ORCID: 0000-0002-6971-8016
AD  - Genome Immunobiology RIKEN Hakubi Research Team, RIKEN Center for Integrative Medical Sciences and RIKEN Cluster for Pioneering Research, Yokohama, Japan.
LA  - eng
PT  - Journal Article
PT  - Research Support, Non-U.S. Gov't
DEP - 20210426
PL  - United States
TA  - PLoS Genet
JT  - PLoS genetics
JID - 101239074
RN  - Squirrel monkey retrovirus
SB  - IM
MH  - Betaretrovirus/genetics
MH  - Cell Line
MH  - Endogenous Retroviruses/genetics
MH  - Gene Expression Regulation
MH  - Genome, Human/*genetics
MH  - Genomic Structural Variation/*genetics
MH  - HIV-1/genetics
MH  - Herpesvirus 6, Human/genetics
MH  - Host-Pathogen Interactions/*genetics
MH  - Human T-lymphotropic virus 1/genetics
MH  - Humans
MH  - Linkage Disequilibrium
MH  - Polymorphism, Single Nucleotide/genetics
MH  - Viruses/*genetics/isolation & purification
MH  - Whole Genome Sequencing
PMC - PMC8101998
COIS- The authors have declared that no competing interests exist.
EDAT- 2021/04/27 06:00
MHDA- 2021/08/28 06:00
PMCR- 2021/04/26
CRDT- 2021/04/26 17:20
PHST- 2021/01/11 00:00 [received]
PHST- 2021/03/25 00:00 [accepted]
PHST- 2021/05/06 00:00 [revised]
PHST- 2021/04/27 06:00 [pubmed]
PHST- 2021/08/28 06:00 [medline]
PHST- 2021/04/26 17:20 [entrez]
PHST- 2021/04/26 00:00 [pmc-release]
AID - PGENETICS-D-21-00026 [pii]
AID - 10.1371/journal.pgen.1009324 [doi]
PST - epublish
SO  - PLoS Genet. 2021 Apr 26;17(4):e1009324. doi: 10.1371/journal.pgen.1009324. eCollection 2021 Apr.