Nearly Complete Human Genomes Reveal Complex Genetic Variation

Structural variants (SVs) are alterations in the DNA sequence that involve large-scale changes, typically longer than 50 base pairs. Advances in long-read sequencing have significantly increased the sensitivity of SV detection and were critical to assembling the first draft human pangenome reference. However, in previous genome assemblies, most centromeres and more than half of the large, highly identical segmental duplications remained incomplete, leaving some protein-coding genes missing.

In a new study published in Nature titled “Complex genetic variation in nearly complete human genomes,” an international team of researchers from the Human Genome Structural Variation Consortium has shown that over 99% of the human genome can be accurately assembled, based on the genomes of 65 diverse individuals (130 haplotypes).

The study closes 92% of all previous assembly gaps and reaches telomere-to-telomere (T2T) status for 39% of the chromosomes. Additionally, the authors uncovered up to 26,115 structural variants per individual for a total of more than 175,000 sequence-resolved events that were seen at least once. 

“The level of diversity within human centromeres is just remarkable,” said Glennis Logsdon, PhD, assistant professor of genetics at the University of Pennsylvania and co-lead author of the study. “We see differences in their sequence, structure, and organization that suggest these regions are evolving more quickly than we ever thought before. This rapid evolution may be important for how centromeres function and adapt over time.” 

Participants for the study came from the 1000 Genomes Project, an international research consortium established in 2007 with the goal of providing a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. 

Among the key insights from the paper are improved assemblies of Y chromosomes, which have historically been challenging to construct because of their highly repetitive sequences. The researchers began characterizing variation in one of the most densely packed regions, known as Yq12, which is marked by limited gene activity. The results suggest that Yq12 is among the most variable regions of the human Y chromosome.

Centromeres are among the most mutation-prone regions of the genome. Deeper analysis, based on the complete sequencing of 1,246 centromeres, found more than 4,000 new variants.

The study also investigated the survival motor neuron genes (SMN1/SMN2). Mutations in SMN1 are linked to spinal muscular atrophy, a genetic disease that causes muscle weakness due to the degeneration of motor neurons in the spinal cord. Because these genes are embedded in a region of long, repeated DNA sequences, fully sequencing them has historically been challenging. New assemblies of this region revealed the structure and copy number of these genes in several of the study participants. The results also distinguished functional copies of SMN1 and SMN2 and offered insights into potential disease-risk sites in a few of the genomes analyzed.

The study was led by six co-corresponding authors, including Evan Eichler, PhD, professor of genome sciences at the University of Washington School of Medicine and Howard Hughes Medical Institute (HHMI) investigator; Miriam Konkel, MD, assistant professor at Clemson University; Jan Korbel, PhD, head of data science at the European Molecular Biology Laboratory (EMBL); Charles Lee, PhD, professor at Jackson Laboratory; Christine Beck, PhD, associate professor at Jackson Laboratory; and Tobias Marschall, PhD, professor at Heinrich Heine University.

The post Nearly Complete Human Genomes Reveal Complex Genetic Variation appeared first on GEN – Genetic Engineering and Biotechnology News.