PMID- 22231483
OWN - NLM
STAT- MEDLINE
DCOM- 20120330
LR  - 20220408
IS  - 1546-1718 (Electronic)
IS  - 1061-4036 (Print)
IS  - 1061-4036 (Linking)
VI  - 44
IP  - 2
DP  - 2012 Jan 8
TI  - De novo assembly and genotyping of variants using colored de Bruijn graphs.
PG  - 226-32
LID - 10.1038/ng.1028 [doi]
AB  - Detecting genetic variants that are highly divergent from a reference
      sequence remains a major challenge in genome sequencing. We introduce de
      novo assembly algorithms using colored de Bruijn graphs for detecting and
      genotyping simple and complex genetic variants in an individual or
      population. We provide an efficient software implementation, Cortex, the
      first de novo assembler capable of assembling multiple eukaryotic genomes
      simultaneously. Four applications of Cortex are presented. First, we
      detect and validate both simple and complex structural variations in a
      high-coverage human genome. Second, we identify more than 3 Mb of sequence
      absent from the human reference genome, in pooled low-coverage population
      sequence data from the 1000 Genomes Project. Third, we show how population
      information from ten chimpanzees enables accurate variant calls without a
      reference sequence. Last, we estimate classical human leukocyte antigen
      (HLA) genotypes at HLA-B, the most variable gene in the human genome.
FAU - Iqbal, Zamin
AU  - Iqbal Z
AD  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK.
FAU - Caccamo, Mario
AU  - Caccamo M
FAU - Turner, Isaac
AU  - Turner I
FAU - Flicek, Paul
AU  - Flicek P
FAU - McVean, Gil
AU  - McVean G
LA  - eng
GR  - 090532/WT_/Wellcome Trust/United Kingdom
GR  - 090532/Z/09/Z/WT_/Wellcome Trust/United Kingdom
GR  - 086084/WT_/Wellcome Trust/United Kingdom
GR  - 085532/WT_/Wellcome Trust/United Kingdom
GR  - WT086084/Z/08/Z/WT_/Wellcome Trust/United Kingdom
PT  - Journal Article
PT  - Research Support, Non-U.S. Gov't
DEP - 20120108
PL  - United States
TA  - Nat Genet
JT  - Nature genetics
JID - 9216904
RN  - 0 (HLA-B Antigens)
SB  - IM
MH  - *Algorithms
MH  - Animals
MH  - Base Sequence
MH  - Chromosome Mapping
MH  - Genome, Human/genetics
MH  - *Genotyping Techniques
MH  - HLA-B Antigens/genetics
MH  - Humans
MH  - Pan troglodytes/genetics
MH  - Sequence Analysis, DNA
MH  - Software
PMC - PMC3272472
MID - UKMS37901
OID - NLM: UKMS37901
EDAT- 2012/01/11 06:00
MHDA- 2012/03/31 06:00
PMCR- 2012/08/01
CRDT- 2012/01/11 06:00
PHST- 2011/04/08 00:00 [received]
PHST- 2011/11/07 00:00 [accepted]
PHST- 2012/01/11 06:00 [entrez]
PHST- 2012/01/11 06:00 [pubmed]
PHST- 2012/03/31 06:00 [medline]
PHST- 2012/08/01 00:00 [pmc-release]
AID - ng.1028 [pii]
AID - 10.1038/ng.1028 [doi]
PST - epublish
SO  - Nat Genet. 2012 Jan 8;44(2):226-32. doi: 10.1038/ng.1028.