ENCODE project salvages 'junk DNA'

Researchers from the ENCODE project, some of whom are Duke faculty, have found that some DNA strands previously thought to be “junk DNA” actually contain a significant amount of information.

After nearly a decade of research, a group of geneticists at Duke and around the world have reported that a significant amount of information exists in swaths of the human genome that were previously referred to as junk DNA.

Researchers from the Encyclopedia of DNA Elements project—known as ENCODE—published numerous papers last week that collectively assign some functionality to approximately 80 percent of the human genome. Previously, the only portions of the genome that were recognized and understood were those catalogued by the Human Genome Project, which released a list of more than 20,000 genes that code for the creation of various proteins. Those sections of the genome make up less than 2 percent of the genome’s roughly three billion nucleotides—the purpose of the rest was unknown.

Now, however, ENCODE researchers have ascribed functions to a much greater portion of the genetic code.

“The ENCODE project really teaches us to embrace our own ignorance,” said Huntington Willard, director of the Duke Institute for Genome Sciences and Policy. “The original ENCODE project was designed to say that this stuff couldn’t just be there doing nothing.”

Because researching unfamiliar portions of the genome is so expensive, it was more economical to sweep these less recognizable nucleotide sequences under the rug or dismiss them as “junk DNA,” Willard said.

The ENCODE researchers, however, remained curious about the massive genetic unknowns. The ENCODE project began in 2003 as an effort to fund research proposals from biologists to analyze and assign meaning to these unknown sections. Over time, the program’s researchers, among them Duke’s Greg Crawford, assistant professor affiliated with the IGSP, refined their methods and began to uncover hidden relationships between separate sections of the genetic code.

Crawford could not be reached for comment in time for publication.

By 2007, the project’s pilot phase had proven so successful that the researchers began to apply their methods to the entire genome. Now, the team has gained understanding of gene interactions and how different sections of nucleotides influence one another, said Robert Cook-Deegan, IGSP director for genome ethics, law and policy.

“This really was the kitchen sink project for genomics,” he said.

The 440 scientists of the ENCODE project have released data on at least 147 different cell types across 1,648 experiments, many of which indicate unexpected links between small sections of DNA and disease. The studies have indicated correlations between certain changes to the genetic code and an individual’s susceptibility to various kinds of disease: a few nucleotides out of place can lead to an immune disorder or another disease.

The studies also found that sections of DNA that do not generate protein-coding RNA can still be transcribed into important regulatory compounds that affect signaling pathways within the cell.

“The ENCODE project begins to shine some light on where we might begin to look for the areas in which genome variation or genome changes might underlie gene dysfunction and thus disease, beyond the mere two percent of the genome that everyone studied before,” Willard said.

The program has also identified over 70,000 promoter regions that activate sections of DNA for transcription, as well as over 400,000 enhancer sections that affect the expression of other portions of the genome.

Although Cook-Deegan noted that many of these regions’ assigned functions are “best guesses,” based on the limited available data, he said that the ENCODE project’s findings should make further discoveries much easier.

“We can now get these tools into the hands of as many geneticists as possible to start figuring out how these things work together in terms of gene expression,” he said.

The ENCODE project has discovered many new roles played by the genome, but its findings have given rise to many more questions than answers. Cook-Deegan said that although the project has located a number of anecdotal links, researchers remain very far from completely understanding how genes interact with one another.

“There’s a much richer story to be told here, but it will take a long time for us to learn how to understand it,” he said.
