Tag Archives: genome

A better potato: researchers sequence the tuber’s entire genome for the first time ever

Researchers at the Max Planck Institute for Plant Breeding Research have set the groundwork for supercharging the potato, by mapping out the tuber’s complete genome.

Image credits James Hills.

Fried, mashed, or thrown in a stew, the humble potato has a special place in our hearts and our plates that nothing else seems to be able to fill. Researchers seem to love this tasty tuber as well, and have put significant effort into decoding its genetic secrets. This impressive work will allow us to create better varieties of potato much faster than traditional breeding methods allow for, with implications for the quality of our meals, the enjoyment we derive from it, and global food security.

Super Tuber

“The potato is becoming more and more integral to diets worldwide including even Asian countries like China where rice is the traditional staple food. Building on this work, we can now implement genome-assisted breeding of new potato varieties that will be more productive and also resistant to climate change — this could have a huge impact on delivering food security in the decades to come.”

The potato has not changed very much in the last 100 years or so. The overwhelming majority of varieties that are available in shops today are the same ones that were put to market over the last century and before. While these traditional cultivars are very popular, they do underline that there is a lack of variety of potatoes being grown, cooked, and enjoyed around the world. Thus, it stands to reason that improvements can be made to the baseline potato in order to make it more palatable, more resilient, or more abundant.

That’s what the team at the Max Planck Institute for Plant Breeding Research hopes to achieve with the full sequencing of the plant’s genome. The work, led by geneticist Korbinian Schneeberger, represents the first full assembly of the potato genome in history, allowing for researchers to work with a much better view of the plant’s genetic intricacies, and thus much more accuracy when trying to breed new varieties of the plant.

Low genetic diversity within a species — and the potato is a good example of one such species — means that it can have difficulties thriving in certain contexts, and leaves it vulnerable to disease. The near-extinction of the Gros Michel banana due to the Panama disease is a great example of such a genetic vulnerability at work. In the case of the potato, the Irish famine of the 1840s stands testament to how completely potato crops can be wiped out by pathogens. During this tragic event, Europeans were growing a single variety of potatoes, which was vulnerable to blight; as such, potato crops failed across the continent.

The Green Revolution of the 1950s and 60s saw a great diversification of crop varieties in staples like rice or wheat, but not potatoes. Efforts to breed new varieties with higher yields or more disease resistance have, so far, remained largely unsuccessful.

Potatoes, the team explains, inherit two copies of each chromosome from each parent, unlike humans, who inherit one copy of each chromosome from each parent. This gives potatoes four copies of each chromosome, making them a ‘tetraploid’ species, which is exceedingly difficult and slow to coax into generating new varieties with desirable combinations of traits.

The same tetraploid structure also makes it technically difficult to reconstruct the potato’s genome.

To work around this issue, the team sequenced DNA not from mature plants but from large numbers of individual pollen cells. These contain only two copies of each chromosome rather than four, which made it easier for the team to use established genetic methods to reconstruct the plant’s genome.

The results should give scientists and plant breeders a powerful new tool with which to identify desirable gene variants in the potato and work to establish new varieties that contain them. Essentially, it gives them a baseline against which they can reliably compare individual plants and establish exactly where their desirable properties originate — and then work to reproduce them.

The paper “Chromosome-scale and haplotype-resolved genome assembly of a tetraploid potato cultivar” has been published in the journal Nature Genetics.

Scientists may have finally sequenced the entire human genome

In 2003, after nearly $3 billion in funding and 13 years of painstaking research, scientists with the Human Genome Project (HGP) announced they had finally mapped the first human genome sequence. This was a momentous breakthrough in science that would revolutionize genomics. However, the initial draft and updates of the human genome sequence that followed were not 100% complete. But now, scientists with the Telomere-to-Telomere (T2T) Consortium claim they’ve addressed the remaining 8% of the human genome that was missing.

“The Telomere-to-Telomere (T2T) Consortium has finished the first truly complete 3.055 billion base pair (bp) sequence of a human genome, representing the largest improvement to the human reference genome since its initial release,” wrote the scientists in a paper published on the preprint server bioRxiv, meaning it has yet to be peer-reviewed.

The first truly complete genome of a vertebrate

The genome is the sum of all the nuclear DNA and mitochondrial DNA (mtDNA) sequences in the cell. It contains all the instructions a living being needs to survive and replicate, consisting of chemical building blocks or “bases” (G, A, T, and C), whose order encodes biological information.

In diploid organisms, such as humans, the size of the genome is considered to be the total number of bases in one copy of its nuclear DNA. Humans and other mammals contain duplicate copies of almost all of their DNA. For instance, we have pairs of chromosomes, with one chromosome of each pair inherited from each parent. But scientists are only interested in sequencing the sum of the bases of one copy of each chromosome pair. A person’s actual genome is roughly six billion bases in size, but a single “representative” copy of the human genome is about three billion bases in size.

Because the human genome is so large, its bases cannot be read in order end-to-end in one single step. What HGP scientists did to sequence the genome was to first break down the DNA into smaller pieces, with each piece then subjected to various chemical reactions that allowed the identity and order of its bases to be deduced. These bits and pieces were then put back together to deduce the sequence of the starting genome.
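The break-and-reassemble idea described above can be sketched in a few lines of Python. This is only a toy illustration, not the HGP's actual pipeline: the function names `overlap` and `greedy_assemble`, the `min_len` threshold, and the example fragments are all assumptions made for this sketch, and it assumes error-free reads from a sequence without long repeats.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the two fragments that share the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left to join; the remaining pieces stay separate
        merged = frags[best_i] + frags[best_j][best_n:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return max(frags, key=len)


if __name__ == "__main__":
    # Hypothetical overlapping fragments read from one short stretch of DNA.
    pieces = ["GTACGTTA", "ATGCGTAC", "GTTAGC"]
    print(greedy_assemble(pieces))  # prints ATGCGTACGTTAGC
```

A greedy merge like this breaks down when the same stretch of bases occurs in many places, because the overlaps become ambiguous. Real assemblers use far more elaborate methods.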

Although genome sequencing technology has advanced a lot since the HGP announced the first draft of the human genome in 2001, a complete sequence of the entire genome was never achieved. Around 8% of the genome was missing, which corresponds to areas where DNA sequences are made up of long repeating patterns. Some of these repeating patterns, such as those found in the centromeres of chromosomes (the ‘knot’ that ties chromosomes together), play important biological roles, but standard technology hasn’t been able to decode them properly.

Using revolutionary new technology, scientists affiliated with T2T now claim that they’ve filled these gaps.

“You’re just trying to dig into this final unknown of the human genome,” Karen Miga, a researcher at the University of California, Santa Cruz, who co-led the international consortium, told STAT News. “It’s just never been done before and the reason it hasn’t been done before is because it’s hard.”

According to Miga and colleagues, the genome breakthrough was made possible thanks to new DNA sequencing technologies developed by Pacific Biosciences in California and Oxford Nanopore in the UK. These technologies do not cut the DNA into tiny pieces for later assembly, which can result in errors. Instead, Oxford Nanopore tech runs the DNA molecule through a nanoscopic hole, resulting in a long sequence. Meanwhile, lasers developed by Pacific Biosciences read the same DNA sequence again and again, which makes the readout far more accurate than previous technology.

Both technologies complemented each other to reveal the missing parts of the genome that have been eluding scientists for almost two decades. According to T2T, the number of DNA bases has been increased from 2.92 billion to 3.05 billion, marking a 4.5% improvement. However, the number of genes only increased by 0.4%, to 19,969. That’s because the vast majority of DNA sequences do not code for proteins but rather regulate the expression and activity of these genes.

“The complete, telomere-to-telomere assembly of a human genome marks a new era of genomics where no region of the genome is beyond reach. Prior updates to the human reference genome have been incremental and the high cost of switching to a new assembly has outweighed the marginal gains for many researchers. In contrast, the T2T-CHM13 assembly presented here includes five entirely new chromosome arms and is the single largest addition of new content to the human genome in the past 20 years,” wrote the researchers.

“This 8% of the genome has not been overlooked due to its lack of importance, but rather due to technological limitations. High accuracy long-read sequencing has finally removed this technological barrier, enabling comprehensive studies of genomic variation across the entire human genome. Such studies will necessarily require a complete and accurate human reference genome, ultimately driving adoption of the T2T-CHM13 assembly presented here,” they added.

The genome that the researchers sequenced didn’t come from a person but rather from a hydatidiform mole, a rare mass or growth that forms inside the womb (uterus) at the beginning of a pregnancy. This tissue forms when sperm fertilizes an egg with no nucleus, so it contains only 23 chromosomes, just like a gamete (sperm or egg), rather than the 46 found in a typical human cell. These cells make the computational effort simpler but may constitute a limitation.

We will find out more once the paper is peer-reviewed and properly scrutinized by the international scientific community. If the findings hold water, they may mark a new age of genomics — one where no nook or cranny of DNA is left unexplored. 

Credit: Pixabay.

You’ve heard about genome sequencing — but what’s exome sequencing?

Image in public domain (via Wiki Commons).

Despite our differences, human beings share 99.9% of the genome. In other words, we all differ by a mere 0.1% of our DNA, which drives the differences in the way we appear, grow, and develop.

Over 80% of rare diseases are caused by genetic mutations in that minuscule difference, and it’s estimated that such undiagnosed diseases affect about 8% of our population. Detecting such diseases is challenging, but researchers are working on promising new techniques.

Potential forms of diagnosis for rare and undiagnosed diseases include:

  • Next Generation Sequencing (NGS), which refers to all large-scale DNA sequencing methods that allow for mapping the entire genome (whole genome sequencing);
  • whole exome sequencing, which focuses on just the exons within all known genes;
  • target gene panels, which cover only the exons of selected genes.

To understand whole exome sequencing (WES), we need to dive into the world of our genetic makeup.

Four letters

The nucleus of every cell in the human body contains 23 pairs of chromosomes, which makes 46 chromosomes in every cell. These chromosomes are in turn made of double-stranded DNA.

DNA is made up of genes, which are built from nucleotides. The human genome consists of about 20,000 genes and 3 billion nucleotides, or “letters.” The ‘letters’ are organic molecules, namely guanine (G), thymine (T), adenine (A), and cytosine (C). G, T, A, and C are arranged in specific sequences in our genes, and these sequences translate into proteins.

But not all 3 billion nucleotides translate into proteins. In fact, only a small percentage (about 1.5%) of these nucleotides are actually translated into proteins. These are “EXpressed regiONS”, or exons.

This has led to the rise of Whole Exome Sequencing, or WES. While the cost of sequencing the entire genome is still out of reach, the cost of sequencing just the exons (aka the functional part of the genome) is low enough that it has been used to find genetic abnormalities leading to rare diseases. It is also much easier to sift through this data.

The complementary, “INTragenic regiON” or introns in genes are not represented or translated in proteins. 

Whole Exome Sequencing:

  • Sequences only the “coding DNA”, or 30 million letters
  • Less intensive analytically and has lower storage requirements
  • Less expensive ($1,000 commercially)
  • High sequencing depth of protein-coding regions
  • Ability to detect certain types of alterations may be limited
  • Includes newly characterized and novel genes

Whole Genome Sequencing:

  • Sequences “all” DNA, introns and exons, or 3 billion letters
  • More variants to analyze and more storage requirements
  • Much more expensive ($20,000 commercially)
  • Extensive and uniform coverage of the genome at a lower sequencing depth, covering both protein-coding and non-coding regions
  • Can detect more types of alterations than exome sequencing
  • Includes newly characterized and novel genes
  • Can detect up to 10-15% more diagnoses than WES

The National Organization for Rare Disorders (NORD) at NCSU hosted Dr. Vandana Shashi, a pediatric genetics specialist at Duke University, on April 22. Shashi served as the co-chair of NIH’s Rare and Undiagnosed disease network and shared her perspectives on Exome and Genome Sequencing.

Given the nature of the method, some alterations are not reliably detected by WES, including deep intronic non-coding region defects, pseudogenes, and repeat regions. As WGS becomes cheaper and more accessible, Dr. Shashi sees this method eclipsing WES.

“I do see WGS becoming cheaper and more accessible in the future, this will eventually eclipse WES,” Dr. Shashi said. “WGS is a lot better at capturing copy number variants, i.e., deletions and duplications that are larger than 50 base pairs.”

In Whole Exome Sequencing, there are three steps: DNA is prepared, sequenced, and processed.

Step 1- DNA Library Preparation 

  • Shear DNA – First, genomic DNA is sheared into random short fragments of about 300 base pairs.
  • Blunting – When an enzyme is used to chop the DNA into small parts, it leaves ends of uneven length on the double strand. Depending on which strand is longer, the overhanging bases are either removed or the missing bases are filled in by an enzyme, producing “blunt” ends of equal length.
  • A-tailing – The blunted ends are then modified by adding a single adenine (A) nucleotide that forms an overhanging “A-tail”.
  • Add adapters – Adapters are ligated to both ends of each fragment to allow sequencing.
  • Enrich library for exome capture – Sequences that correspond to exons are captured by hybridization to DNA or RNA baits and then pulled down by coated magnetic beads. The selected fragments create a library enriched for exons.

Step 2- DNA Sequencing 

Exome capture is followed by amplification of the sample and massively parallel sequencing. Massively parallel second-generation sequencing (aka next generation sequencing) generates billions of base pairs of data. Barcodes that allow sample indexing can be introduced at this step.

“You attach DNA to the flow cells and amplify. Basically, you are doing a number of simultaneous PCR reactions (multiplex PCR reactions). Then you read the sequence and you get a lot of fragments,” Dr. Shashi said. “These fragments come in short reads of 100-115 base pairs long.”

Step 3- DNA Analysis

The next step is DNA Alignment and Variant calling.

“These fragmented base pairs from the previous step are overlapped with one another and they are compared against a reference genome,” Dr. Shashi said. 

A reference genome here is a so-called ‘normal genome’, or a representative example of the set of genes in one idealized individual of the human species.
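The alignment and variant calling step can be illustrated with a deliberately simplified Python sketch. It is not how production tools work: it places each read at the reference position with the fewest mismatches (real aligners use indexed search and handle insertions and deletions), and the names `best_position`, `call_variants`, and `min_depth`, as well as the example sequences, are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict


def best_position(read, reference):
    """Return the reference offset where `read` has the fewest mismatches."""
    def mismatches(pos):
        window = reference[pos:pos + len(read)]
        return sum(r != g for r, g in zip(read, window))

    return min(range(len(reference) - len(read) + 1), key=mismatches)


def call_variants(reads, reference, min_depth=2):
    """Pile up aligned reads and report positions that differ from the reference.

    Returns a list of (position, reference_base, observed_base) tuples.
    """
    pileup = defaultdict(Counter)
    for read in reads:
        start = best_position(read, reference)
        for offset, base in enumerate(read):
            pileup[start + offset][base] += 1

    variants = []
    for pos in sorted(pileup):
        depth = sum(pileup[pos].values())
        base, _count = pileup[pos].most_common(1)[0]
        # Only trust positions covered by several reads.
        if depth >= min_depth and base != reference[pos]:
            variants.append((pos, reference[pos], base))
    return variants


if __name__ == "__main__":
    reference = "ACGTTGCAAGCTTAGGCT"
    # Hypothetical short reads from a sample carrying a single-letter change.
    reads = ["GTTGCAAT", "GCAATCTT", "ATCTTAGG"]
    print(call_variants(reads, reference))  # prints [(9, 'G', 'T')]
```

Requiring a minimum depth before reporting a difference reflects the idea that a change seen in only one read may simply be a sequencing error.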

Bioinformatics tools are then used in DNA analysis. They usually use one of these three file formats:

  • FASTQ – Offers the full sequencing data and a corresponding quality score. Each sequence is entered as a four-line record. Files are very large and require a lot of storage space.
  • BAM (Binary Alignment Map) – Facilitates alignment of FASTQ data to a reference genome. Files are very large and require a lot of storage space.
  • VCF (Variant Call Format) – A standardized text file for representing single nucleotide polymorphisms (SNPs), insertions and deletions in the genome (INDELs), and corresponding variations. This is the most commonly used format.
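To make the four-line layout of sequencing reads and their quality scores more concrete, here is a minimal Python sketch of a FASTQ reader. The function name `parse_fastq` and the sample record are hypothetical, and the sketch assumes the common Phred+33 quality encoding (each quality character's ASCII code minus 33), which the talk did not specify.

```python
def parse_fastq(text):
    """Parse FASTQ text into a list of dicts, one per four-line record.

    Assumes one record per four lines:
      1. '@' followed by the read identifier
      2. the base sequence
      3. '+' (optionally followed by the identifier again)
      4. one quality character per base (Phred+33 assumed)
    """
    lines = [ln.strip() for ln in text.strip().splitlines()]
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ input must contain a multiple of four lines")

    records = []
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        records.append({
            "id": header[1:],
            "sequence": seq,
            "quality": [ord(c) - 33 for c in qual],  # Phred+33 assumption
        })
    return records


if __name__ == "__main__":
    # Hypothetical single-read example.
    example = "@read1\nGATTACA\n+\nIIIIII!\n"
    for rec in parse_fastq(example):
        print(rec["id"], rec["sequence"], rec["quality"])
```

A higher score means the sequencer was more confident in that base call, which is why downstream tools often filter or trim low-scoring bases before alignment.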

Courtesy Twist Biosciences 

Dr. Shashi used this method to diagnose a 20-month-old with Brown–Vialetto–Van Laere Syndrome 2 (BVVLS2) (Shashi et al.). The team used high-dose riboflavin therapy, i.e., large doses of vitamin B2, to stabilize the child’s deteriorating neurological condition.

Whole exome sequencing is shaping up to be the most exciting advance in the world of genetics and could possibly be a much larger stepping stone in the world of undiagnosed and rare disorders. Stay tuned as we keep up with this evolving biotechnology.

Scientists sequence genome of Fleming’s original penicillin-producing fungus

A group of researchers has successfully sequenced the genome of the mold that produced penicillin, the world’s first true antibiotic, using samples frozen alive more than fifty years ago. The team compared Alexander Fleming’s original sample of Penicillium mold to two strains of mold used to produce the substance today.

The freeze-dried Fleming strain from which the Penicillium fungus was grown and genome sequenced. Credit CABI.

Back in 1928, biologist Alexander Fleming noticed Penicillium mold growing in a culture of Staphylococcus aureus he was studying. It appeared the experiment was wrecked but Fleming noticed that where the mold grew, the bacteria didn’t. He later identified the chemical compound that was fatal to the bacteria and called it penicillin in honor of the humble mold.

Fleming froze samples of the mold that produced his first isolated samples of pure penicillin. More than 50 years later, a group of researchers at Imperial College London and the University of Oxford decided to look them up. They compared the samples with the genomes of two modern strains of Penicillium mold, now used in the United States.

“We originally set out to use Alexander Fleming’s fungus for some different experiments, but we realized, to our surprise, that no one had sequenced the genome of the original Penicillium, despite its historical significance to the field,” said Timothy Barraclough, co-author, in a statement.

The researchers found a subtle difference between the two genomes, which might help us better combat antibacterial resistance. Most antibiotics are based on chemicals that fungi or bacteria produce to defend themselves. If you get a dose of penicillin, it was likely produced by mold cultures, which are descendants of samples taken from moldy cantaloupes.

Over the years, antibiotics manufacturers bred their cantaloupe mold cultures to produce more penicillin. This means the genomes of modern industrial Penicillium mold are probably very different from their cantaloupe-eating ancestors.

The team looked at two sets of genes in particular: the ones that code for the enzymes involved in making penicillin, and the ones that control how much of an enzyme to make and when. They found that modern strains had more copies of the genetic instructions for making those enzymes, which meant those cells would make more enzymes and thus more penicillin.

While nature favors the traits that make mold more likely to survive and pass on its genes, artificial selection by humans cares about penicillin production over everything else. But Fleming’s mold and the modern strains used different versions of the enzymes that make penicillin. This could be due to evolution in the lab or because the strains are from different continents and evolved different enzymes.

If that’s the case, those different enzymes might produce different versions of penicillin. Still, there’s not enough data now to say exactly how the different enzymes impact the final product. The difference could lead to more efficient penicillin production, more effective penicillin, or a way to work around at least some of the resistance certain bacteria strains have evolved to the drug, the researchers believe.

“Industrial production of penicillin concentrated on the amount produced, and the steps used to artificially improve production led to changes in numbers of genes,” Ayush Pathak, lead author, said in a statement. “But it is possible that industrial methods might have missed some changes for optimizing penicillin design, and we can learn from natural responses to the evolution of antibiotic resistance.”

The study was published in the journal Scientific Reports.

What DNA can tell us about the transatlantic slave trade in the Americas

Schematic shows general direction of the triangular trade routes between continents during the transatlantic slave trade. Credit: Micheletti et al./ The American Journal of Human Genetics.

Over 10 million Africans were forcibly taken from their homelands and transported to the Americas to live the rest of their days as slaves. The same fate would befall their children for hundreds of years until the end of the 19th century. This dark and shameful episode in humanity’s history is etched in the genomes of people alive today in North, Central, and South America, as well as the Caribbean. In a new study, researchers tapped into this DNA to reveal new insights about the trans-Atlantic slave trade, some of which have escaped written records.

“Our study combined the genetic data of more than 50,000 people on both sides of the Atlantic with historical records of enslaved people to create one of the most comprehensive investigations of the transatlantic slave trade,” says first author Steven Micheletti, a population geneticist at 23andMe, who employed genetic data from the Intra-American Slave Trade database. “One of the disturbing truths this research revealed was how the mistreatment of people with African ancestry shaped the current genetic landscape of African ancestry in the Americas.”

The researchers at 23andMe, a personal genomics and biotechnology company based in Sunnyvale, California, found that most African-Americans have their roots in Angola and the Democratic Republic of Congo. That was to be expected given historical records, but there were surprising patterns of migration and interbreeding that came to light when the researchers had a closer look.

One such finding was that Nigerian ancestry is over-represented in African Americans in the United States, mainly due to the “transport of slaves within the Americas, primarily from the Caribbean,” according to senior author Joanna Mountain, Senior Director of Research at 23andMe.

In contrast, the genetic flow between African Americans and Senegambians was much lower than expected, even though people from Senegal and Gambia disembarked in North America in great numbers during the height of the trans-continental slave trade.

But even this revelation wasn’t all that surprising when the researchers examined the historical records. Senegambians were mostly put to work on rice plantations in the US, which were infested with malaria and had very high mortality rates. As such, most died before they had the chance to pass on their genes.

The researchers were also able to infer the effects of policies enacted by slave-owners and local governments across the Americas. In some cases, the policies could be so different that the DNA differences became striking across many generations.

For instance, slave owners in the U.S. wanted enslaved Africans to have children with one another in order to boost the workforce. Even after slavery was abolished in the U.S. with the passing of the 13th Amendment, those of African descent still found themselves segregated.

In contrast, slaves of African descent living in Latin America often interbred with white-skinned Europeans — in fact, the practice was very much encouraged in order to promote the “dilution” of the African ancestry. Today, the proportion of people with greater than 5% African ancestry is five times lower in Latin America than in the US, despite the fact that 70% of all African slaves disembarked in South America.

The Latin American dilution was mainly promoted between darker-skinned females and white-skinned Europeans — and this shows to this day in the DNA of their living descendants.

“Our analysis estimated about 15 African women had children for each African man in Central and South America, as well as the Latin Caribbean,” says Micheletti.

“The female bias is particularly shocking given that the majority of enslaved individuals were male,” he added, mentioning that female gene bias was also strong in North America.

A study published earlier this year by Brazilian researchers, who studied the DNA of 6,267 individuals with more than 10% African ancestry from 25 populations, came to similar conclusions. The study found that West Central African ancestry (from countries such as Nigeria and Ghana) is the most common in the Americas. West African ancestry (i.e. Senegal and Gambia) increases going northward while Bantu ancestry (from the south and southeast Africa) is more significant in the South of Brazil.

“The African Diaspora was so massive (>9 million people), that the genetic diversity observed in the African portions of our admixed genomes is similar to that of African populations of origin of slavery. However, admixture homogenized this diversity (and the mutations responsible for diseases) between the different populations of the African continent,” Eduardo Tarazona-Santos, a researcher at the Federal University of Minas Gerais in Brazil and lead author of the study, told ZME Science.

These findings might not only provide new insights into the tragic slave-trade, but might also enable those of African descent to find their roots and come to a better understanding of what their ancestors had to go through.

“This paper conveys how the racist and dehumanizing acts endemic to the slave trade led to different patterns of African ancestry across the Americas that we can see in the DNA of people living today. We hope readers grasp not only the impact of the slave trade but also the deep contributions enslaved Africans made to the history, economy, and culture of the Americas,” says Micheletti.

The findings appeared in the American Journal of Human Genetics.

Scientists sequence the genomes of six bat species for clues to their unique features

Myotis myotis (Greater mouse-eared bat), Credit: Olivier Farcy.

Bats are the only flying mammals in the animal kingdom — but that’s not all they’re known for. Bats have a number of quite extreme adaptations, such as echolocation, highly sensitive sensory perception, significant longevity for their size, resistance to cancer, and exceptional immunity to viral infections. In fact, the coronavirus that has caused the world to grind to a halt is believed to have evolved inside bats, before jumping into humans.

No doubt, bats are amazing creatures. Now, for the first time, researchers have sequenced the raw genetic material that contains the instructions for bats’ unique, superpower-like adaptations.

“Given these exquisite bat genomes, we can now better understand how bats tolerate viruses, slow down aging, and have evolved flight and echolocation. These genomes are the tools needed to identify the genetic solutions evolved in bats that ultimately could be harnessed to alleviate human aging and disease,” Emma Teeling, senior author of the new study and a researcher at University College Dublin, said in a statement.

Teeling and colleagues affiliated with Bat1K, a global consortium of researchers on a mission to sequence the genomes of every one of the 1,421 living bat species, published a study today in which they describe the genomes of six bat species.

The genomes were analyzed with high accuracy using state-of-the-art sequencing technology and are about 10 times more complete than any bat genome published in the past.

“Using the latest DNA sequencing technologies and new computing methods for such data, we have 96-99% of each bat genome in chromosome level reconstructions – an unprecedented quality akin to for example the current human genome reference which is the result of over a decade of intensive “finishing” efforts. As such, these bat genomes provide a superb foundation for experimentation and evolutionary studies of bats’ fascinating abilities and physiological properties,” Eugene Myers, senior author of the study and Director of Max Planck Institute of Molecular Cell Biology and Genetics, and the Center for Systems Biology, said in a statement.

The first six bat genomes that were sequenced as part of the Bat1K global genome consortium belonged to the greater horseshoe bat (Rhinolophus ferrumequinum), the Egyptian fruit bat (Rousettus aegyptiacus), the pale spear-nosed bat (Phyllostomus discolor), the greater mouse-eared bat (Myotis myotis), the Kuhl’s pipistrelle (Pipistrellus kuhlii) and the velvety free-tailed bat (Molossus molossus).

Their genetic blueprints were compared to 42 other mammals, which enabled the researchers to pinpoint the position of bats on the mammalian tree of life.

Rhinolophus ferrumequinum (Greater horseshoe bat), Credit: Daniel Whitby.

Due to their many unique quirks, the question of where bats fit in on the tree of life has long been unresolved. But using novel phylogenetic methods and molecular datasets, the evidence suggests that bats are most closely related to Fereuungulata, a group of mammals that includes carnivores like dogs, cats, and seals, as well as pangolins, whales, and hoofed mammals. That is not a very narrow placement, seeing how bats and cows are on the same roster, but as more bat genomes are sequenced their taxonomy can be refined further.

The side-to-side comparison of different mammalian genomes also helped tease apart adaptations that are unique to bats through the loss and gain of certain genes.

For instance, the genes that enable bats’ famous echolocation were selected for in the ancestral branch of bats, suggesting this is an ancient trait in this group of mammals.

There was also evidence of gene loss and gain involved in immunity, particularly the expression of antiviral APOBEC3 genes. This may explain why bats have exceptional immunity that makes them extremely tolerant to viral infections.

In this day and age, understanding the molecular mechanisms that allow bats to withstand coronaviruses may lead to new approaches, therapies, and vaccines meant to increase human survivability in the face of COVID-19.

“Having such complete genomes allowed us to identify regulatory regions that control gene expression that are unique to bats. Importantly we were able to validate unique bat microRNAs in the lab to show their consequences for gene regulation. In the future we can use these genomes to understand how regulatory regions and epigenomics contributed to the extraordinary adaptations we see in bats,” said Sonja Vernes, Co-Founding Director of Bat1K at the Max Planck Institute for Psycholinguistics in Nijmegen and a senior author of the study.

Although the researchers sequenced the genomes of only six bats, they’ve already learned quite a lot. However, this is merely the beginning — there are still more than 1,400 known bat species to go.

The findings appeared in the journal Nature.

Geneticists sequence the complete human X chromosome for the first time

For the first time, scientists have determined the complete sequence of a human chromosome, namely the X chromosome, from ‘telomere to telomere’. This is truly a complete sequencing of a human chromosome, with no gaps in the base pair read and at an unprecedented level of accuracy.

A step closer towards the complete blueprint of a human being

The Human Genome Project was a 13-year-long, publicly funded project initiated in 1990 with the objective of determining the DNA sequence of the entire human genome.

Although the project was met with initial skepticism by scientists and non-scientists alike, the overwhelming success of the Human Genome Project is readily apparent. Not only did it usher in a new era in medicine, but it also led to significant advances in DNA sequencing technology.

When the Human Genome Project was finished, its running costs tallied $2.7 billion of taxpayers’ money. Today, a human genome can be sequenced for less than $200 — that’s a 13.5-million-fold reduction in cost. And, it’s still going down.
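The fold reduction quoted above can be checked with a couple of lines of arithmetic. This is only a sanity check using the two dollar figures from the text; the variable names are illustrative.

```python
# Figures quoted in the text above.
hgp_cost_usd = 2_700_000_000   # total cost of the Human Genome Project
current_cost_usd = 200         # upper bound quoted for sequencing a genome today

# How many times cheaper sequencing has become.
fold_reduction = hgp_cost_usd / current_cost_usd
print(f"{fold_reduction:,.0f}-fold reduction")
```

Since the text says "less than $200", the true reduction would be at least this large.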

However, despite its resounding success, the sequencing of the human genome remained incomplete: some regions could not be finished for technical reasons.

These gaps have been gradually filled as technology improved after the Human Genome Project officially ended in 2003.

But, until last year, there were still 100 or so regions that were yet unknown. Now, some of these regions have been brought to light, helping to complete the sequencing of the human X chromosome.

The X chromosome is one of two sex-determining chromosomes passed down from parent to child. A zygote that receives two X chromosomes – one from each parent – will grow into a female, while an X and a Y chromosome result in a male.

According to Karen Miga, a research scientist at the UC Santa Cruz Genomics Institute, this was all possible thanks to new sequencing technologies that enable “ultra-long reads,” such as the nanopore sequencing technology.

In the initial stages of the Human Genome Project, scientists could read 500 bases at a time, or 500 letters per sequence. In the mid-2000s, the amount of DNA that could be read at a time was reduced (100-200 bases), but the accuracy of technology increased. Then around 2010, new technology came on the market that could read 1,000-10,000, and now more recently 100,000 or more bases at a time thanks to nanopore technology.

Nanopore tech involves funneling single molecules of DNA through a tiny hole. Changes in the electrical current flowing through the hole reveal the sequence of bases passing through it.

“These repeat-rich sequences were once deemed intractable, but now we’ve made leaps and bounds in sequencing technology,” Miga said. “With nanopore sequencing, we get ultra-long reads of hundreds of thousands of base pairs that can span an entire repeat region, so that bypasses some of the challenges.”
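To see why read length matters for repeats, here is a minimal sketch that checks which generations of sequencing technology could cover a repeat region in a single read. The read lengths are the approximate figures mentioned in the text; the 20,000-base repeat, the 50-base flanking anchor, and the `can_span` helper are hypothetical values chosen for illustration, not numbers from the study.

```python
# Approximate read lengths (in bases) by era, taken from the text above.
read_lengths = {
    "early Human Genome Project": 500,
    "mid-2000s short reads": 200,
    "around 2010": 10_000,
    "nanopore ultra-long reads": 100_000,
}

def can_span(read_length: int, repeat_length: int, anchor: int = 50) -> bool:
    """True if one read can cover the repeat plus `anchor` unique bases on each side."""
    return read_length >= repeat_length + 2 * anchor

# Hypothetical repeat region of 20,000 bases (illustrative, not from the study).
repeat_length = 20_000
spanning = {era: can_span(n, repeat_length) for era, n in read_lengths.items()}

for era, ok in spanning.items():
    print(f"{era}: {'spans' if ok else 'cannot span'} the repeat")
```

A read that does not reach unique sequence on both sides of a repeat cannot be placed unambiguously, which is roughly why shorter reads left gaps in repeat-rich regions.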

The approach itself was simple in principle: collect as many of these long reads as possible from a single cell line of interest.

“We chose a unique cell line that has two copies of every chromosome, just like any normal cell, but each of those copies is identical to one another. Rather than having to resolve the genome of two genomes, we only had a single version to worry about. Then you can grow these cell lines clonally, so you don’t have variation in them, and then sequence them on these instruments,” Dr. Adam Phillippy of the National Human Genome Research Institute said in a statement.

Scientists collected data over the course of six months, and then used algorithms to stitch the puzzle pieces back together again.

This is how they sequenced the centromere, a large repetitive stretch of sequence that, as its name suggests, sits near the middle of the X chromosome, along with a number of other genome arrays on the X chromosome.

This work opens up a range of new possibilities in research, including the prospect of identifying new associations between genetic sequence variation and disease, as well as new clues into human biology and evolution.

“We’re starting to find that some of these regions where there were gaps in the reference sequence are actually among the richest for variation in human populations, so we’ve been missing a lot of information that could be important to understanding human biology and disease,” Miga said in a statement.

The complete sequencing of the X chromosome signifies yet another massive victory for science. However, there are still 23 other chromosomes to go — all of them might be completely mapped out by the end of this year, the researchers said.

The findings appeared in the journal Nature.

Researchers encode “The Wizard of Oz” in DNA with unprecedented accuracy and efficiency

Credit: The Wizard of Oz (1939).

DNA is ridiculously good at storing information. One milliliter droplet of DNA can theoretically store as much information as two Walmarts full of data servers. What’s more, DNA can be stored at room temperature for hundreds of thousands of years. If your gears are turning right now, you’re not alone.

However, using DNA to store information is not nearly as straightforward as storing it on a flash drive. In fact, it can be a nightmare to encode and decode information from the blueprint of life, but science is making strides.

In a new study, researchers at the University of Texas have employed a new technique for storing and reading information encoded in the iconic double-helix “twisted ladder”.

The researchers demonstrated their novel technique by encoding the entire book of “The Wizard of Oz”, translated into Esperanto, with unprecedented accuracy and efficiency.

“The key breakthrough is an encoding algorithm that allows accurate retrieval of the information even when the DNA strands are partially damaged during storage,” said Ilya Finkelstein, an associate professor of molecular biosciences and one of the authors of the study.

DNA: 5 million times more efficient than any storage medium employed today

Every cell in our bodies, and even our instincts, are encoded in sequences of adenine (A), thymine (T), guanine (G), and cytosine (C), the four nucleotide bases of DNA. Ever since the double-helix structure of DNA was first described in the 1950s by James Watson and Francis Crick (and the largely uncredited Rosalind Franklin), scientists have realized that huge quantities of data could be stored at high density in only a few molecules.

Just one gram of DNA is enough to store the entirety of all human knowledge, which is why some are keen on using the blueprint of life as the ultimate time capsule.

Additionally, DNA can be stable for a long time, as a recent study showed when researchers recovered DNA from a 430,000-year-old human ancestor found in a cave in Spain.

For years, scientists have been storing all sorts of information in DNA, particularly during the previous decade. In 2017, researchers at the New York Genome Center (NYGC) stored a full computer operating system, an 1895 French film, “Arrival of a train at La Ciotat,” a $50 Amazon gift card, a computer virus, a Pioneer plaque and a 1948 study by information theorist Claude Shannon into 72,000 DNA strands each 200 bases long.

However, we’re still a long way from using DNA as a reliable storage medium. For one, synthesizing and reading DNA is prohibitively expensive.

The biggest impediment, however, is the fact that DNA is highly prone to errors.

Unlike malfunctioning computer code, which tends to show up as blanks, errors in DNA sequences appear as insertions or deletions. This can cause a huge predicament since such errors shift the whole sequence, with no blank spaces to alert us.
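The difference between a local error and a shifting error can be illustrated with a toy example. The sketch below uses a naive two-bits-per-base mapping, which is an assumption made for illustration and not the encoding used by the University of Texas team; the helper names (`encode`, `decode`, `wrong_chars`) are likewise hypothetical.

```python
BASES = "ACGT"  # naive mapping: 2 bits per base (an assumption, not the study's scheme)

def encode(text: str) -> str:
    """Turn each ASCII byte into four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in text.encode("ascii")
        for shift in (6, 4, 2, 0)
    )

def decode(dna: str) -> str:
    """Read bases four at a time; drop any incomplete trailing group."""
    chars = []
    for i in range(0, len(dna) - len(dna) % 4, 4):
        value = 0
        for base in dna[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        chars.append(chr(value))
    return "".join(chars)

def wrong_chars(original: str, recovered: str) -> int:
    """Count positions that differ, plus any characters lost from the end."""
    mismatches = sum(a != b for a, b in zip(original, recovered))
    return mismatches + abs(len(original) - len(recovered))

message = "There is no place like home"
dna = encode(message)

# Substitution: replace the base at index 8 with a different one.
swapped = "A" if dna[8] != "A" else "C"
substituted = dna[:8] + swapped + dna[9:]

# Deletion: drop the base at index 8, so everything after it shifts by one.
deleted = dna[:8] + dna[9:]

print("substitution damage:", wrong_chars(message, decode(substituted)))
print("deletion damage:", wrong_chars(message, decode(deleted)))
```

With this naive scheme a substitution corrupts a single character, while a deletion pushes every later base out of frame and garbles the characters that follow it.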

In order to account for inherent errors in DNA, researchers had to repeat a piece of information 10 to 15 times. These repetitions can be compared to track insertions or deletions.

But due to the way the team at the University of Texas chose to store information, there is no need for repetitions.

“We found a way to build the information more like a lattice,” said Stephen Jones, a research scientist who collaborated on the project with Finkelstein. “Each piece of information reinforces other pieces of information. That way, it only needs to be read once.”

To demonstrate the reliability of their method, Finkelstein’s team of researchers encoded the Wizard of Oz into DNA, which they then subjected to high temperature and extreme humidity.

Naturally, the DNA strands became damaged, but all the information was read successfully. This marks a huge leap in the long road to DNA storage of information.

“We tried to tackle as many problems with the process as we could at the same time,” said John Hawkins, co-author of the new study and a Ph.D. alumnus of the Oden Institute for Computational Engineering and Sciences at the University of Texas.

“What we ended up with is pretty remarkable.”

The method was described in the Proceedings of the National Academy of Sciences.

Researchers dive deep into the genetic legacy of the transatlantic slave trade

Print showing an alleged incident of an enslaved African girl whipped to death for refusing to dance naked on the deck of the slave ship Recovery, a slaver owned by Bristol merchants. Captain John Kimber was denounced before the House of Commons by William Wilberforce over the incident. In response to outrage by abolitionists, Captain Kimber was brought up on charges before the High Court of Admiralty in June 1792, but acquitted of all charges. Credit: United States Library of Congress’s Prints and Photographs division.  

Researchers in Brazil combined historical and genetic data to reveal new insights about the transatlantic slave trade that saw more than 9 million Africans shipped in chains to the Americas from the early 16th century until the mid-19th century. The findings suggest that the African populations imported their genetic diversity and spread their mutations in the Americas through admixture with indigenous and European populations.

“We know in the Americas that the slave trade was a human tragedy, but it is part of our history and identity. This is why my group, but mainly myself and my former PhD student Mateus Gouveia focused in the African Diaspora,” Eduardo Tarazona-Santos, a researcher at the Federal University of Minas Gerais in Brazil and lead author of the new study, told ZME Science.

African populations are the most diverse in the world, genetically speaking. Tarazona worked closely with colleagues in Brazil, Peru, and the United States to assemble what he calls the “largest up-to-date dataset of Americas and African genetic data”, which includes 6,267 individuals with more than 10% African ancestry from 25 populations.

Researchers compared the genetic data with historical demographic data from the Slave Voyages database, which tracks and maps the dispersal of enslaved Africans into the Americas.

“We came out with a mathematical method that makes this comparison compatible. Then we realized that comparing genetic and historical-demography data is something modern geneticists had forgotten to do during the last 10-20 years, but it this kind of comparisons were more common before and have a solid tradition in human population genetics, since the work by Luca Cavalli-Sforza (who passed away in 2018) sixty years ago in the Parma Valley in Italy, where he compared genetic data (from blood groups) with parish record data. So recovering this kind of work, is like making a tribute to Luca Cavalli-Sforza. Reading his books has been an inspiration for many young investigators that in the nineties decided to dedicate to human population genetics, as I did,” Tarazona said.

The Transatlantic Slave Trade transported more than 9 million Africans to the Americas between the early 16th and the mid-19th centuries. Credit: Eduardo Tarazona-Santos, of the Federal University of Minas Gerais in Brazil.

The researchers found that West Central African ancestry (from countries such as Nigeria and Ghana) is the most common in the Americas. West African ancestry (e.g. Senegal and Gambia) increases going northward, while Bantu ancestry (from south and southeast Africa) is more significant in the south of Brazil.

Historical records show that the transatlantic slave trade was at its height between 1750 and 1850. The new study found that this period also coincides with the most admixture between imported African populations and locals of European and indigenous ancestry. This timing implies that the 19th century was critical in shaping the structure of the African gene pool in the New World.

“The African Diaspora was so massive (>9 million people), that the genetic diversity observed in the African portions of our admixed genomes is similar to that of African populations of origin of slavery. However, admixture homogenized this diversity (and the mutations responsible for diseases) between the different populations of the African continent,” Tarazona told ZME.

All in all, the study provides unique insights into the gene flow caused by the massive transatlantic slave trade, whose influence is still important in today’s social and cultural setting in the Americas.

“Our results imply that the Africans imported most of their genetic diversity, including the mutations responsible for the diseases, and that admixture has spread these mutations in the Americas along most of the continent. In Africa, they are more compartmentalized geographically. This is important when we interpret data about where there are in the Americas mutations responsible for diseases such as cystic fibrosis and hereditary cancer,” Tarazona concluded.

The findings appeared on March 2 in the journal Molecular Biology and Evolution.

Ancient Dane’s life reconstructed from 5,700-year-old chewing gum

Artist interpretation of “Lola”, an Early Neolithic hunter-gatherer female who lived in Denmark. Credit: Tom Björklund.

While agriculture was spreading through many parts of Europe, communities of hunter-gatherers in Denmark still practiced their ancient lifestyles. This is what the life of “Lola”, a Neolithic Dane with dark skin, blue eyes, and dark hair, seems to suggest. Remarkably, information about Lola’s appearance, diet, lifestyle, and even medical history was not extracted from her remains — those were never found — but rather from a perfectly preserved 5,700-year-old “chewing gum”.

The ancient chewing gum is actually a piece of birch tar, a sticky substance that was primarily employed as a glue by Middle Pleistocene communities. However, early humans likely used birch tar for other purposes as well. People would likely chew on the tar to make it malleable before using it in tool manufacturing. They might have also chewed it for medical purposes, to soothe toothaches, to suppress hunger, or simply because they liked the feeling, much as modern humans chew gum.

The ancient birch pitch with ruler for scale. Credit: Theis Jensen.

This particular piece of birch tar, which was recovered from a site in southern Denmark, was found sealed in mud. The substance was already primed for preservation thanks to its hydrophobic (water-repellent) properties, but the local environment helped protect the chewed substance from the elements.

“Almost everything is sealed in mud, which means that the preservation of organic remains is absolutely phenomenal,” said Theis Jensen, co-author of the new study.

The pristine preservation of the sample allowed researchers at the University of Copenhagen to sequence the full genome of the person who last chewed on it. Not only that, they also extracted genetic information about the oral bacteria that inhabited Lola’s mouth, as well as information about her diet.

“It is the first time that an entire ancient human genome has been extracted from anything other than human bones,” Hannes Schroeder of the University of Copenhagen, told AFP.

Although her age could not be determined, the excellently preserved genome showed that the Neolithic female had dark hair, dark skin, and blue eyes.

These features were common among foragers in continental Europe. In fact, the genome traces Lola’s lineage to mainland Europe and not central Scandinavia. And, since the ancient chewing gum contained traces of duck and hazelnuts, Lola was likely a forager, too, despite the fact that she lived during the Early Neolithic, when agriculture was already established around Europe, particularly south of the Danube.

Lola was also lactose intolerant, fitting the narrative that lactase persistence in adults only appeared fairly recently, after the introduction of dairy farming. This suggests that the region where the birch tar was found may have been quite late in adopting agriculture.

The birch pitch also contained microbial DNA. Most of these organisms were harmless, but the researchers also identified a bacterium linked to gum disease, as well as DNA associated with pneumonia and a virus that causes mononucleosis (glandular fever).

All of these insights were gleaned from an unassuming piece of very old gum. Sounds like a good day for science!

The findings appeared in the journal Nature Communications.

New genetic research effort aims to make watermelons tastier, more resilient

If you like watermelons, this team has big news for you.

Image credits Aline Ponce.

A new research effort aims to pave the way towards new and improved watermelons. The study took a comprehensive look at the genomes of all seven watermelon species to create a database that plant breeders can use to produce tastier, plumper, and more resistant watermelons.

The Better Melon

“As humans domesticated watermelon over the past 4,000 years, they selected fruit that were red, sweet and less bitter,” said Zhangjun Fei, a faculty member at Boyce Thompson Institute and co-leader of the international effort.

“Unfortunately, as people made watermelons sweeter and redder, the fruit lost some abilities to resist diseases and other types of stresses.”

Back in 2013, Fei co-led the creation of the first watermelon reference genome. This database was built from an East Asian cultivated variety, ‘97103’. That variety, and likely the watermelon you’re imagining right now, belongs to the Citrullus lanatus species, i.e. the sweet fruit with a juicy red interior.

However, Fei explains that there are six other wild species of watermelon that have pale, hard, bitter fruits, but possess other desirable qualities — such as a higher resilience against man-made climate change. Introducing the genes that generate such qualities into cultivated watermelon varieties can help make the fruits tastier, better able to grow in diverse climates, as well as more resistant to pests, diseases, and other factors. But, in order for us to get there, we first need to know which genes these are.

In order to find out, the team started with the reference genome Fei worked on in 2013, and created an improved version. The previous work relied on short-read sequencing technologies, Fei explains, while the newer one uses long-read sequencing technologies, allowing for “a much higher quality genome that will be a much better reference for the watermelon community.”

Next, the group sequenced the genomes of 414 watermelons across all seven species. By comparing these genomes both to the new reference genome and to each other, they were able to determine the evolutionary relationship of the different watermelon species.

“One major discovery from our analysis is that one wild species that is widely used in current breeding programs, C. amarus, is a sister species and not an ancestor as was widely believed,” Fei said.

Modern watermelon cultivars were domesticated by breeding out the fruits’ bitterness while increasing their sweetness and size and reddening their flesh. Over the past few hundred years, the fruits kept becoming sweeter, but also improved with regard to flavor and crispness of texture. The team identified several regions of the watermelon genome that could be leveraged to continue improving these qualities in cultivars.

“The sweet watermelon has a very narrow genetic base,” says Amnon Levi, a research geneticist and watermelon breeder at the U.S. Department of Agriculture, one of the study’s co-authors. “But there is wide genetic diversity among the wild species, which gives them great potential to contain genes that provide them tolerance to pests and environmental stresses.”

The team also published an accompanying paper analyzing 1,175 melons, including cantaloupe and honeydew varieties. The researchers found 208 genomic regions that were associated with fruit mass, quality, and morphological characteristics, which could be useful for melon breeding.

The paper “Resequencing of 414 cultivated and wild watermelon accessions identifies selection for fruit quality traits” has been published in the journal Nature Genetics.



Researchers make chicken cells resist bird flu by snipping out a tiny bit of their DNA

Designer chicken cells grown in the lab at Imperial College London can resist the spread of bird flu.


Image credits Samet Uçaner.

Bird flu, as its name suggests, is mostly concerned with infecting birds. And it’s quite good at it: severe strains of bird flu can completely wipe out a whole flock. In rare cases, the virus can even mutate to infect humans, causing serious illness. As such, bird flu is a well-known and scary pathogen in the public’s eye.

Now, researchers from Imperial College London and the University of Edinburgh’s Roslin Institute have devised chicken cells that can resist infection with the bird flu virus. Their efforts pave the way towards effective control of the disease, safeguarding one of the most important domesticated animals of today.

Be-gone, flu

“We have long known that chickens are a reservoir for flu viruses that might spark the next pandemic. In this research, we have identified the smallest possible genetic change we can make to chickens that can help to stop the virus taking hold,” says Professor Wendy Barclay, Chair in Influenza Virology at Imperial College London and the paper’s corresponding author. “This has the potential to stop the next flu pandemic at its source.”

The findings could make it possible to immunize chickens against the virus using a simple genetic modification. No such chickens have been produced just yet, but the team is confident that their method will prove safe, effective, and palatable to the public in the long run.

The approach involves a specific molecule found in chicken cells, called ANP32A. Researchers at Imperial report that during a bird flu infection, viruses use this molecule to replicate (multiply) and continue attacking the host. The researchers from the University of Edinburgh’s Roslin Institute worked to gene-edit chicken cells to remove a portion of DNA that encodes the production of ANP32A.

With this little tweak, the team reports, the virus was no longer able to replicate inside the cells.

Members at The Roslin Institute have previously worked on something similar. Teaming up with researchers from Cambridge University at the time, they successfully produced gene-edited chickens that didn’t transmit bird flu to other chickens following infection. However, the approach they used at the time involved adding new genetic sequences into the birds’ DNA; while the proof-of-concept was very encouraging, the approach didn’t seem to stick, commercially.

“This is an important advance that suggests we may be able to use gene-editing techniques to produce chickens that are resistant to bird flu,” says Dr. Mike McGrew, of the University of Edinburgh’s Roslin Institute and a paper co-author.

“We haven’t produced any birds yet and we need to check if the DNA change has any other effects on the bird cells before we can take this next step.”

The paper “Species specific differences in use of ANP32 proteins by influenza A virus” has been published in the journal eLife.


Team sequences the pan-genome of tomatoes in a bid to make them tasty again

Researchers at the Agricultural Research Service (ARS) and the Boyce Thompson Institute (BTI) want to bring back the tasty tomato of yore.


Image credits Mauro Borghesi.

Sadly, it seems that store-bought tomatoes just aren’t very tasty. An international research team thinks they have the way to fix this tasteless problem, though. They have finished constructing the pan-genome for the cultivated tomato and its wild relatives, mapping almost 5,000 previously undocumented genes. Armed with this knowledge, researchers might be able to bring the flavor back.

They don’t make them like they used to

“These novel genes discovered from the tomato pan-genome added substantial information to the tomato genome repertoire and provide additional opportunities for tomato improvement,” says co-author Zhangjun Fei, a bioinformatics scientist at the Boyce Thompson Institute.

“The presence and absence profiles of these genes in different tomato populations have shed important lights on how human selection of desired traits have reshaped the tomato genomes.”

A genome is the map of an organism’s genes and their functions. Genomes are, unsurprisingly, sequenced for individual organisms, and these are in turn used to create a kind of reference genome for the rest of the species. The team’s pan-genome, on the other hand, includes all of the genes from 725 different cultivated and closely related wild tomatoes, which revealed 4,873 genes that were absent from the original reference genome.
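The idea of a pan-genome can be sketched with plain sets: take the union of the genes seen across many individual plants and subtract what the single reference genome already contains. The gene names and accessions below are hypothetical placeholders, and the real analysis works on sequence assemblies rather than ready-made gene lists.

```python
# Hypothetical gene names and accessions, for illustration only.
reference_genes = {"geneA", "geneB", "geneC"}
accessions = {
    "cultivar_1": {"geneA", "geneB", "geneC"},
    "heirloom_1": {"geneA", "geneB", "geneD"},
    "wild_1": {"geneA", "geneC", "geneD", "geneE"},
}

# Pan-genome: every gene seen in the reference or in at least one accession.
pan_genome = set(reference_genes).union(*accessions.values())

# Genes that a single reference genome would have missed.
novel_genes = pan_genome - reference_genes

# Presence/absence profile: which accessions carry each gene.
presence = {
    gene: [name for name, genes in accessions.items() if gene in genes]
    for gene in sorted(pan_genome)
}

print(sorted(novel_genes))
for gene, carriers in presence.items():
    print(gene, carriers)
```

In this toy data the two genes missing from the reference show up only in the heirloom and wild samples, which mirrors the kind of presence and absence pattern the researchers describe.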

So what seems to be the problem with our tomatoes? Where’s the taste? The team reports that cultivated tomatoes show a wide range of physical and metabolic variation but, by and large, they’ve all been through several severe bottlenecks during their domestication and later breeding. In effect, this means that today’s tomatoes aren’t very genetically diverse.

Modern breeders, the team explains, have focused on traits such as yield, shelf life, disease resistance, and stress tolerance, which are economically important to growers. However, the pan-genome does point to a few genes we can use to improve the flavor, too.

“One of the most important discoveries from constructing this pan-genome is a rare form of a gene labeled TomLoxC, which mostly differs in the version of its DNA gene promoter,” explained James Giovannoni, a molecular biologist at the Agricultural Research Service (ARS) and paper co-author.

“The gene influences fruit flavor by catalyzing the biosynthesis of a number of lipid (fat)-involved volatiles–compounds that evaporate easily and contribute to aroma.”

TomLoxC also facilitates the production of apocarotenoids — a class of organic chemicals derived from carotenoids including vitamin A precursors — which function as signaling molecules for various responses in plants, including environmental stresses. The compounds also have a variety of floral and fruity odors that are important in tomato taste, the team notes.

The rarer version of TomLoxC was found in only 2% of older or heirloom varieties of large tomato. The same version was present in 91% of currant-sized wild tomatoes, primarily Solanum pimpinellifolium, the wild predecessor of the cultivated tomato. It is becoming more common in newer varieties.

“It appears that there may have been strong selection pressure against or at least no selection for the presence of this version of TomLoxC early in the domestication of tomatoes,” Giovannoni added. “The increase in prevalence of this form in modern tomatoes likely reflects breeders’ renewed interest in improved flavor.”

The team says that with the pan-genome in hand, breeders should be able to quickly improve the flavor of mass-produced tomatoes without sacrificing the traits that make them so economically viable.

The team also expects that the nearly 5,000 new tomato genes they’ve identified in the pan-genome will help breeders improve the crop in further ways. Tomatoes, although botanically fruits, are one of the most eaten vegetables worldwide, with a total annual production of 182 million tons (worth more than $60 billion). In the U.S., tomatoes are the second-most consumed vegetable after potatoes. Each American eats an average of 20.3 pounds of fresh tomatoes and an additional 73.3 pounds of processed tomatoes per year (estimated based on 2017 figures).

The paper “The tomato pan-genome uncovers new genes and a rare allele regulating fruit flavor” has been published in the journal Nature Genetics.

California redwoods. Credit: CBS This Morning.

Scientists sequence genomes of world’s tallest trees

Coast redwoods and giant sequoia trees are California’s oldest residents, some being more than 2,000 years old. These magnificently tall trees have had their genomes sequenced for the first time, a major breakthrough that scientists claim will help preserve them for generations from the perils of disease and climate change.

The $2.6 million Redwood Genome Project, which began in 2017, is the culmination of state-of-the-art genetic research and the most extensive genetic study ever done on primeval forests. David Neale, a University of California Davis plant scientist who led the project, along with colleagues at Johns Hopkins University and the Save the Redwoods League, used a supercomputer to analyze the DNA extracted from tissues taken from a coast redwood tree (Sequoia sempervirens) in Butano State Park and a giant sequoia tree (Sequoiadendron giganteum) from Sequoia and Kings Canyon National Parks.

Amazingly, the researchers found that the two species had some of the largest genomes known so far. The coast redwood genome has 6 sets of chromosomes and 27 billion base pairs of DNA, compared to only two sets of chromosomes and nine times fewer base pairs in humans. The giant sequoia has a more modest genome with 8 billion base pairs, but that’s still three times larger than the human genome.
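The comparisons above can be checked against each other with a little arithmetic. The sketch uses only the figures in the text; the human genome size is not quoted directly, so it is derived here from the "nine times fewer" statement, which is an assumption of this example.

```python
coast_redwood_bp = 27_000_000_000   # from the text
giant_sequoia_bp = 8_000_000_000    # from the text

# "Nine times fewer base pairs in humans" implies this human genome size.
implied_human_bp = coast_redwood_bp / 9

# How the giant sequoia compares with that implied human figure.
sequoia_vs_human = giant_sequoia_bp / implied_human_bp

print(f"Implied human genome: {implied_human_bp / 1e9:.1f} billion base pairs")
print(f"Giant sequoia vs human: {sequoia_vs_human:.2f}x")
```

By these numbers the giant sequoia genome comes out a little under three times the size of the human one, so the "three times larger" in the text is best read as a rounded figure.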

“These narrow endemics play important roles in ecology, economy, culture, and conservation. Although redwoods have been around for millions of years, we know very little about how these trees evolved to occupy their current range,” Neale wrote on the project’s website.

One of the largest genomes sequenced so far belongs to the axolotl, a North American salamander, which numbers about 28 billion base pairs. Researchers believe that its rich genome is what allows the salamander to not only regenerate limbs but also grow back internal organs.

It makes sense for a tree such as a redwood to have a complex genome. These trees can grow in the same place for thousands of years so they require a robust ability to fight off fungi, insects, and significant swings in temperature and humidity throughout their lifetime.

“We’re trying to build a 23andMe for trees, where a manager sends in their samples and gets a risk evaluation of their forest populations, if not individual trees,” Neale said in a statement. “Completing the sequences of the coast redwood and giant sequoia genomes is the first step.”

Sequencing the genomes of the world’s tallest trees, which can grow taller than the Statue of Liberty, is paramount to their conservation. Old-growth forests used to stretch across the Sierra Nevada range and along the California coast all the way to the Oregon border. Sadly, loggers have cut down more than 95% of these forests since 1850. The few remaining forests have been granted special status and are protected in national parks, but they are still threatened by climate change.

Ultimately, the project aims to develop genetic variation models for the various groves of old growth. In the future, it might be possible for a forest manager to send no more than a tree’s leaf to a specialized lab and get back a report on the trees and their vulnerability to drought and variations in temperature. This way, they can make restoration decisions based on genetic diversity. This process of identifying flaws in the trees is not all that different from the one that led to new cures for diseases like sickle cell anemia after the first draft of the human genome was completed in 2000.

“Every time we plant a seedling or thin a redwood stand to reduce fuel loads or accelerate growth, we potentially affect the genomic diversity of the forest,” said Emily Burns, director of science for Save the Redwoods League. “With the new genome tools we’re developing now, we will soon be able to see the hidden genomic diversity in the forest for the first time and design local conservation strategies that promote natural genomic diversity. This is a gift of resilience we can give our iconic redwood forests for the future.”

The Caulobacter ethensis-2.0 genome in a micro tube. Credit: ETH Zurich.

Scientists present first computer-generated artificial genome

Researchers at ETH Zurich have demonstrated a new method for producing genomes that is cheaper and faster than ever before. The authors produced the first fully computer-generated genome, Caulobacter ethensis-2.0, for which a corresponding organism does not yet exist. However, the genome was physically produced and inserted into an existing organism with similar genetic material.

Computers and synthetic life

In 2010, researchers at the J. Craig Venter Institute reported a landmark advancement in synthetic biology: the first bacterial DNA engineered from scratch. This was the culmination of more than a decade of hard work and a $40 million investment. But while the bacterial genome made by Craig Venter was an exact copy of a natural genome, researchers at ETH Zurich radically altered the genome of a model organism called Caulobacter crescentus. What’s more, their research spanned a time frame of a year and cost less than half a million dollars, which shows that synthetic biology is on the brink of a revolution.

Caulobacter crescentus is a harmless freshwater bacterium whose genome has been extensively studied. Previous research showed that out of the bacterium’s 4,000 genes only about 680 were crucial to the survival of the organism in laboratory conditions. Beat Christen, Professor of Experimental Systems Biology at ETH Zurich, and his brother, Matthias Christen, a chemist at ETH Zurich, used this minimum set of crucial genes as a starting point.

The researchers designed a computer algorithm that scanned this minimally viable natural genome and computed the ideal DNA sequence for the synthesis and construction of the genome. The algorithm replaced a sixth of all the 800,000 DNA letters found in the minimal genome.

“Through our algorithm, we have completely rewritten our genome into a new sequence of DNA letters that no longer resembles the original sequence. However, the biological function at the protein level remains the same,” says Beat Christen.
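The idea of rewriting DNA letters while keeping the protein unchanged rests on the redundancy of the genetic code: most amino acids can be spelled by more than one three-letter codon. The sketch below is only an illustration of that principle, not the ETH Zurich algorithm, whose design rules are not described here. The function names `translate` and `recode` and the toy input sequence are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(dna):
    """Split a coding sequence into three-letter codons."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna):
    """Translate a coding DNA sequence into a protein string ('*' = stop)."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(dna))


def recode(dna):
    """Hypothetical recoding rule: swap each codon for a different codon
    that encodes the same amino acid, whenever such a synonym exists."""
    rewritten = []
    for codon in split_codons(dna):
        amino_acid = CODON_TABLE[codon]
        synonyms = sorted(
            c for c, aa in CODON_TABLE.items() if aa == amino_acid and c != codon
        )
        rewritten.append(synonyms[0] if synonyms else codon)
    return "".join(rewritten)


original = "ATGGCTAAATGGTAA"  # toy sequence, not taken from Caulobacter
rewritten = recode(original)
changed = sum(a != b for a, b in zip(original, rewritten))

print(original, "->", rewritten)
print("letters changed:", changed)
print("same protein:", translate(original) == translate(rewritten))
```

A real design tool would choose among synonymous codons according to synthesis constraints rather than picking the first alphabetical alternative, but the check at the end captures the point made in the quote: the letters differ while the translated protein stays the same.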

The hard part was only just beginning. Next, the researchers had to produce a DNA molecule which contained the artificial bacterial genome, and this had to be done step by step. The researchers synthesized 236 separate genome segments, which they then had to delicately piece together.

“The synthesis of these segments is not always easy,” explains Matthias Christen. “DNA molecules not only possess the ability to stick to other DNA molecules, but depending on the sequence, they can also twist themselves into loops and knots, which can hamper the production process or render manufacturing impossible.”

An electron microscope image of Caulobacter crescentus, a harmless bacterium living in fresh water. Credit: ETH Zurich.

As an experiment, the researchers produced strains of bacteria in the lab that contained the naturally occurring Caulobacter genome as well as segments of the new artificial genome. By switching off certain natural genes in the bacteria, the researchers were able to test the functions of the artificial genes introduced earlier. The rewritten genome was designed by an algorithm that could only draw on what was understood about the DNA sequence at the time. Naturally, there are also DNA sequences whose function scientists have yet to understand, and that information can be lost in the process of creating the new code.

“Our method is a litmus test to see whether we biologists have correctly understood genetics, and it allows us to highlight possible gaps in our knowledge,” explains Beat Christen.

These experiments showed that only 580 of the 680 artificial genes were functional, which means the algorithm needs tweaking before researchers can hope to achieve a fully functional genome in version 3.0.

Even though this version isn’t perfect, the new study demonstrates how modern technology can streamline artificial DNA synthesis. And who knows: in the future scientists might finally create synthetic organisms that would serve a wide array of biotech applications. For instance, custom-made bacteria could be used to produce active molecules for drugs or DNA vaccines.

“We believe that it will also soon be possible to produce functional bacterial cells with such a genome,” says Beat Christen.

“As promising as the research results and possible applications may be, they demand a profound discussion in society about the purposes for which this technology can be used and, at the same time, about how abuses can be prevented,” he added. It is still not clear when the first bacterium with an artificial genome will be produced – but it is now clear that it can and will be developed. “We must use the time we have for intensive discussions among scientists, and also in society as a whole. We stand ready to contribute to that discussion, with all of the know-how we possess.”

The findings were reported in the Proceedings of the National Academy of Sciences.

Credit: Wikimedia Commons.

Great white shark genome might teach us how to heal faster or stave off cancer

Great whites are some of the most recognizable marine species. Our fascination with these majestic but fearsome creatures deepens now that scientists have completed the first genome sequencing of the iconic apex predator.

Scientists sink their teeth into the great white’s genome

The great white’s genome was decoded by an international team of researchers, including those at the Nova Southeastern University’s (NSU) Save Our Seas Foundation Shark Research Center, Guy Harvey Research Institute (GHRI), Cornell University College of Veterinary Medicine, and Monterey Bay Aquarium.

“Decoding the white shark genome is providing science with a new set of keys to unlock lingering mysteries about these feared and misunderstood predators – why sharks have thrived for some 500 million years, longer than almost any vertebrate on earth,” said Dr. Salvador Jorgensen, a Senior Research Scientist at the Monterey Bay Aquarium, who co-authored the study.

According to the results, the great white genome contains one and a half times more information than the human genome. That was not surprising to learn, given that they have 41 pairs of chromosomes, whereas humans have only 23.

There’s no doubt that great whites (Carcharodon carcharias) have experienced tremendous evolutionary success. They’re found throughout most of the world’s oceans, grow up to half the length of a bus, have more than 300 razor-sharp, triangular teeth arranged in seven rows, can detect a seal from two miles away, and sit at the top of the food chain. Their only threat is humans, whose overfishing and illegal hunting have caused the great white shark to be listed as a vulnerable species on the IUCN Red List.

Not only can great whites grow to a large size, but they also have a long lifespan, easily reaching 70 years in the wild. Yet despite their size and lifespan, the predators rarely get cancer. Previously, research had established a linear relationship between an animal’s body size and the incidence of cancer, but the great white seems to be one of those rare exceptions. The new study suggests that this is partly due to the great white’s genome stability: genetic adaptations that help preserve its genome.

Another remarkable feature of great whites is their extraordinary ability to heal quickly. Researchers have traced this ability back to certain genes that are tied to fundamental pathways involved in wound healing, including a key blood clotting gene.

“Not only were there a surprisingly high number of genome stability genes that contained these adaptive changes, but there was also an enrichment of several of these genes, highlighting the importance of this genetic fine-tuning in the white shark,” said Mahmood Shivji, who is the director of NSU’s Save Our Seas Foundation Shark Research Center.

“Genome instability is a very important issue in many serious human diseases; now we find that nature has developed clever strategies to maintain the stability of genomes in these large-bodied, long-lived sharks,” said Shivji. “There’s still tons to be learned from these evolutionary marvels, including information that will potentially be useful to fight cancer and age-related diseases, and improve wound healing treatments in humans, as we uncover how these animals do it.”

Decoding the white shark’s genome is a great breakthrough that will help conserve the species. For instance, the genome data could be used to better assess white shark population dynamics. The insight gained from the great white’s genome might also lead to novel cancer drugs in the future.

The findings were reported in the journal PNAS. 

Butterflies are genetically wired to mate with others like them

Male butterflies take a particular liking to females which look just like them, researchers found.

Heliconius melpomene malleti feeding on a Gurania flower. Image credits: Chris Jiggins.

Butterflies are weird creatures — there, I’ve said it. Their very existence is tied to one of the most bizarre processes in the natural world (metamorphosis), they have crazy long tongues, and they evolved at least 55 million years ago — making them much, much older than mankind. They are also beautiful creatures with remarkably colorful wings which have been admired by mankind since the dawn of our civilization.

But, for all our admiration, there’s still much we don’t know about them, particularly at the genetic level. To address that, researchers from the University of Cambridge, in collaboration with the Smithsonian Tropical Research Institute in Panama, observed the courtship rituals of two Colombian species of Heliconius, a colorful and widespread genus commonly known as longwings. The team also sequenced the DNA from nearly 300 butterflies to find out how much of the genome was responsible for their mating behavior. Their results brought forth a few surprises. Professor Chris Jiggins, one of the lead authors on the paper and a Fellow of St John’s College, explains:

“There has previously been lots of research done on finding genes for things like colour patterns on the butterfly wing, but it’s been more difficult to locate the genes that underlie changes in behaviour.

“What we found was surprisingly simple – three regions of the genome explain a lot of their behaviours. There’s a small region of the genome that has some very big effects.”

Heliconius melpomene rosina feeding on a Gurania flower. Image credits: Chris Jiggins

Unlike most butterflies, which use chemical signals to find a mate, Heliconius males use their long-range vision to locate females, which also explains why they have distinctive wing markings. Researchers took advantage of this fact and carried out another experiment, introducing male butterflies of one species to females from both species. They then followed the males, noting their level of sexual interest in each of them (yes, for science).

They found that males would most often choose females with similar wing markings — again, a rather surprising fact. Dr. Richard Merrill, one of the authors of the paper, based at Ludwig-Maximilians-Universität, Munich, said:

“It explains why hybrid butterflies are so rare — there is a strong genetic preference for similar partners which mostly stops inter-species breeding. This genetic structure promotes long-term evolution of new species by reducing intermixing with others.”

Researchers also published a second paper on the subject, reporting that although hybrids are very rare, there is a surprisingly large amount of DNA shared between both species, DNA that has been shared through hybridization — ten times more than Neanderthals and humans share, for instance. The reason for this, researchers suspect, is that the lifespan of butterflies is shorter than that of humans, which allows for a much higher number of generations over the same period.

“Over a million years a very small number of hybrids in a generation is enough to significantly reshape the genomes of these butterflies,” says Simon Martin, another one of the authors.

But despite this genetic mixing, the two species retain different behaviors and have not become blended. The part of the genome that defines the sex of the butterflies is protected from the effects of inter-species mating, but more importantly, their genome is tweaked and shaped by natural selection and cultural preferences, which allow species to remain distinct and unique.

Professor Jiggins says that ultimately, this type of study suggests that humans are not as unique as we used to think.

“In terms of behaviour, humans are unique in their capacity for learning and cultural changes but our behaviour is also influenced by our genes. Studies of simpler organisms such as butterflies can shed light on how our own behaviour has evolved. Some of the patterns of gene sharing we see between the butterflies have also been documented in comparisons of the human and Neanderthal genomes, so there is another link to our own evolution,” he concludes.

The two papers have been published in PLoS Biology and are freely available.

Credit: Posth et al./Cell.

Ancient DNA reveals two previously unknown migrations into South America

Scientists analyzed the ancient DNA of individuals who lived in Central and South America up to 10,000 years ago and found that these regions were settled by at least three waves of migration. The studies paint a rich and diverse history of the Americas, suggesting that the people who formed these migratory waves branched out of a single population that crossed the Bering Strait into North America about 15,000 years ago.

“Our work multiplied the number of ancient genomes available from these areas by about 20, giving us a much more comprehensive picture of indigenous history in the Americas,” co-senior author David Reich, a geneticist at Harvard Medical School and the Howard Hughes Medical Institute, said in a statement. “This broader dataset reveals a common origin of North, Central, and South Americans as well as two previously unknown genetic exchanges between North and South America.”

The DNA collected from 49 individuals who lived in Belize, Brazil, the Central Andes, and southern South America shows that they all originate from the same ancestral population that colonized North America. In and of itself, this fact is not particularly remarkable because scientists have long known that Central and South America were peopled by a migration that moved southward. However, what was truly surprising about the findings of three new ancient DNA studies, all published this week (Cell, Science, Science Advances), was that there were multiple distinct migratory movements: some that mixed, others that formed new lineages.

Archaeologists believe that the Clovis people were the first to pass through the land bridge between Siberia and Alaska, which is now underwater, settling in the lower 48 states some 13,000 years ago. The Clovis culture was named after flint spearheads found in the 1930s at a site in Clovis, New Mexico. These mammoth-hunting people are considered to be the ancestors of most of the indigenous cultures of the Americas. Now, the new genomic analysis has yielded fresh insights into how Clovis people may have spread across the Americas.

Researchers compared the genome of a Clovis toddler who lived in Montana about 12,700 years ago to the earliest genomes analyzed from South and Central America, which date to between 9,000 and 11,000 years ago. The analysis revealed a common ancestry between the remains found in Montana and Lagoa Santa in Brazil, which suggests that the Clovis made a major impact much further south. Previously, anthropologists believed that the people at Lagoa Santa originated from a separate migration from Asia.

“We weren’t expecting to find a relation to people associated with the Clovis culture in South America,” says co-first author Nathan Nakatsuka from Harvard. “But it seems the expansion of the Clovis-associated lineage extended to parts of Central and South America.”

From around 9,000 years ago, however, the Clovis culture-associated ancestry completely disappeared in Peru. We don’t know what caused such a dramatic large-scale population replacement, but what seems certain is that the region was populated by a separate wave of migration, which showed remarkable continuity compared to Eurasia and Africa.

“There is remarkable continuity between earlier and later skeletons with South Americans today,” said Cosimo Posth, an archaeogeneticist from the Max Planck Institute for the Science of Human History. “For example, modern-day Quechua and Aymara from the Central Andes can trace their ancestry back to the ancient people of the Cuncaicha site from 9,000 years ago onwards. This is a longer-standing continuity than you see in other continents.”

The big question right now is why the branching occurred so fast. What seems certain is that the narrative of humanity’s distribution across the Americas is far more complex than meets the eye.

“We’re very enthusiastic about the prospects for a much richer understanding of American population history, but this is still a vast region full of geographic and chronological holes,” says Reich. “We’d like to collect more genetic material from earlier and later sites and from more countries, such as Colombia, Venezuela, and other parts of Brazil. We also want to examine the evolution of genetic traits over time.”

Credit: Pixabay.

Ancient retrovirus may make some people more prone to addiction

Substance abuse is on the rise in the United States, claiming tens of thousands of lives each year. Despite a burgeoning rehab industry and billions of dollars dedicated to research, the underlying causes of drug dependency are still poorly understood. For instance, we don’t know what makes some people more vulnerable to addiction than others.

An international team of researchers recently published a study that suggests the answer to this question may be buried deep in our genetic fabric. According to the findings, an ancient retrovirus present in a higher proportion among people battling drug addiction may be evidence of a physical cause of addiction.

Our dark genes

Although many retroviruses went extinct hundreds of thousands or millions of years ago, they still live on in our DNA. Retroviruses infect cells and replicate by inserting their DNA into their host cell’s genome. Sometimes that cell can be a germ cell, such as a sperm or egg, so the retroviral DNA is inherited by offspring just like a normal gene. Scientists call these elements human endogenous retroviruses (HERVs).

Scientists estimate that up to eight percent of human DNA is made up of retroviral sequences.

Researchers from several institutions, including Oxford University and the National-Kapodistrian University of Athens, studied people who injected drugs in Greece and Scotland. After a basic genetic screening of the study’s participants, the researchers found that drug users were about three times more likely to have remnants of the HK2 retrovirus within a particular gene in their DNA than people who didn’t use drugs. The virus was identified in 34% of drug users tested in Glasgow, Scotland, compared to 9.5% of the local population, and in 14% of Greek patients, compared to 6% of that country’s population.
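The "about three times" figure can be checked against the percentages quoted above with simple division. The snippet below is only a back-of-the-envelope sketch using those four numbers; it does not reproduce the statistical analysis in the study, and the variable and function names are made up for this illustration.

```python
# Share of each group carrying the HK2 integration, in percent,
# exactly as quoted in the article.
cohorts = {
    "Glasgow": {"drug_users": 34.0, "general_population": 9.5},
    "Greece": {"drug_users": 14.0, "general_population": 6.0},
}


def prevalence_ratio(drug_users, general_population):
    """How many times more common the integration is among drug users."""
    return drug_users / general_population


ratios = {
    name: round(prevalence_ratio(**figures), 2)
    for name, figures in cohorts.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x more common among people who inject drugs")
```

The two cohorts give somewhat different ratios, one above three and one below it, which is consistent with the rounded "about three times" summary.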

The HK2 integration is present in only 5-10% of the population, where it may affect the RASGRF2 gene, which is involved in regulating the brain’s dopamine levels. The neurotransmitter dopamine helps control the brain’s reward and pleasure centers but is also involved in addictive behavior when it’s generated in high amounts as a result of drug use.

Earlier, in 2012, scientists had linked the same gene with binge-drinking. 

“We know of clear biological roles for a small number of human endogenous retroviruses. However, there has never before been strong evidence in support of a role in human biology of an endogenous retrovirus that is unfixed, in other words not shared by all individuals in the population. Our study shows for the first time that rare variants of HK2 can affect a complex human trait. The replication of this finding in the distinct Athens and Glasgow cohorts is particularly important,” Professor Katzourakis, from the University of Oxford, who co-directed the study, said in a statement.

Although they haven’t established a causal relationship, the correlation identified in the study is strong. The authors suspect that HK2 may predispose a fraction of the population to addictive behavior.

Previously, studies have found a link between HERVs and autoimmune disorders, along with other harmful effects.

“Most people think these ancient viruses are harmless. From time to time, people have shown overexpression of HK2 in cancer, but it has been difficult to distinguish cause from effect. Back in 2012, following a 20-year controversy regarding their pathogenic roles in humans, we sought to test the high-risk hypothesis that HERVs can be responsible for human disease. Our proposal was supported by the Medical Research Council, and now we have strong proof that HERVs can be pathogenic. For the first time, we are able to make a distinction between cause and effect in HERV pathogenicity,” Dr Magiorkinis, from the University of Athens, who led the study, added.

The new results may represent evidence for a physical cause of addiction. If that is indeed the case, then the way drug addiction is handled, both in medical practice and in society, where it is highly stigmatized, could change dramatically.

Whether discussing heroin, prescription drugs, marijuana, or synthetics, American drug abuse has reached alarming levels. In 2014, the National Institute on Drug Abuse (NIDA) reported that an estimated 24.6 million Americans over the age of 12 had used an illicit drug during the last month. This accounted for 9.4% of the demographic, which is an increase from 8.3% in 2002.

Many drug users are unable to get help because of the stigma attached to their addiction. A link between a genetic trait and addiction might lead to a revolution in the way drug addiction is viewed by the public. 

The next step is to find an actual mechanism by which HK2 manipulates the dopamine system in the brain. Understanding the inner workings of this potential molecular mechanism could also allow scientists to develop better treatments for drug dependence.

“Looking into this “dark” part of the genome will unlock more genomic secrets,” said Dr. Magiorkinis.

Scientific reference: Timokratis Karamitros, Tara Hurst, Emanuele Marchi, Eirini Karamichali, Urania Georgopoulou, Andreas Mentis, Joey Riepsaame, Audrey Lin, Dimitrios Paraskevis, Angelos Hatzakis, John McLauchlan, Aris Katzourakis, Gkikas Magiorkinis. Human Endogenous Retrovirus-K HML-2 integration within RASGRF2 is associated with intravenous drug abuse and modulates transcription in a cell-line model. Proceedings of the National Academy of Sciences, 2018; 201811940.


California two-spot octopus (O. bimaculoides). Credit: Thomas Kleindinst.

MDMA or ‘ecstasy’ makes octopuses more social, too

People who take MDMA, a common recreational drug which is also known as Molly or ecstasy, feel a sensation of elation and the urge to connect with others. Now, a fascinating new study suggests that this applies to octopuses too, despite the fact that we’re separated by 500 million years of evolution.

MDMA acts by increasing the activity of three neurotransmitters in the central nervous system: serotonin, dopamine, and norepinephrine. The emotional and pro-social effects of MDMA are likely caused directly or indirectly by the release of large amounts of serotonin, which influences mood (as well as other functions such as appetite and sleep). Serotonin also triggers the release of the hormones oxytocin and vasopressin, which play important roles in love, trust, sexual arousal, and other social experiences.

Gul Dolen, an Assistant Professor of neuroscience at Johns Hopkins University, along with colleagues, studied the California two-spot octopus (Octopus bimaculoides), a species that is less challenging to work with in laboratory conditions. It’s also the only octopus to have its genome fully sequenced, allowing the researchers to make a gene-by-gene comparison with the human genome.

Researchers gave some octopuses a dose of MDMA and then studied their behavior. What they saw surprised them, considering the solitary nature of O. bimaculoides. Individuals under the influence of the drug spent more time with other octopuses, both male and female. The most striking behavior, however, was that they engaged in extensive ventral surface contact — in other words, they were very touchy-feely. The typically rare physical contact between the octopuses was non-violent and more exploratory in nature.

“Despite anatomical differences between octopus and human brain, we’ve shown that there are molecular similarities in the serotonin transporter gene,” Dolen said in a statement. “These molecular similarities are sufficient to enable MDMA to induce prosocial behaviors in octopuses.”

These findings show that O. bimaculoides shares with humans the serotonin transporter gene, which is known to serve as the principal binding site of MDMA. So it seems like this is an ancient neurotransmitter system shared across vertebrate and invertebrate species, which evolved hundreds of millions of years ago.

Of course, the serotonin system did not evolve to get creatures high but rather to enable complex social behaviors. For instance, the octopus may rely on this common pathway to behave socially during the mating season.

In the future, the researchers plan on sequencing the genomes of two other species of octopus, which are closely related to each other but differ in their behaviors. This way, they hope to gain more insight into the evolution of social behavior.

The findings appeared in the journal Current Biology.