Q&A with Tim Reddy

Biostatistics professor Tim Reddy is working on the ENCODE project, which aims to sequence and decode the human genome.

Tim Reddy, an assistant professor of biostatistics and bioinformatics at Duke, has been involved in the ENCODE project since 2008. The ENCODE project aims to decode the human genome in depth, particularly the regions between genes that were previously known as “junk DNA.” Reddy’s research focuses on the regulatory mechanisms within the genome. The Chronicle’s Ryan Zhang sat down with Reddy to discuss the discoveries made by the ENCODE project and the future of the field.

The Chronicle: Can you tell us about the ENCODE project?

Tim Reddy: We sequenced the human genome around 2003, and that basically gave us the entire sequence of DNA that determines what makes a person a human. But we didn’t really understand what the vast majority of that sequence did. We had a good sense of where the genes were, and it was fairly straightforward, at least at first, to map out those genes. But that only covered about 2.5 percent of the human genome, which left about 97.5 percent of the genome undocumented.... The goal of the ENCODE project was to, in a very comprehensive and organized fashion, start to decode what’s going on in between the genes and try to give some sort of regulatory or functional annotation to that other 97.5 percent of the genome. Basically we’re taking that first sequencing of the human genome and trying to overlay a lot more functional information on top of it.

TC: How did you get involved with the project and how long have you been working on it?

TR: I started working on the project when I started my [postdoctoral research] in 2008. When we first started the project, we were working heavily on just getting production and analysis pipelines up and running. One of the features of ENCODE is that it’s a lot of high-throughput assays all performed on a common set of cell types.

Given the number of assays we were running, it was really important to ensure that we could limit the variability between the assays. To make that data as high-quality as possible took a couple of years. Once that was established, I spent the next few years working on some specific analysis projects. What I’ve been particularly interested in is how variation in the human population actually impacts which regulatory elements are used between different people. In the last few years, I’ve done a lot of work studying how the differences can be looked at from a functional point of view.

TC: What is your role in the project?

TR: We look at something that has been studied for a long time: allele-specific activity.... When you look at a gene that’s expressed, every person has, for the most part, two copies of every gene—one that came from the mother and one that came from the father. One of the things that we can do with the data produced by ENCODE is that we can actually distinguish the maternal copy from the paternal copy in gene expression in a single individual. What that allowed us to say was that, for some gene that a certain individual is expressing, most of that comes from the maternal copy and less from the paternal copy, or vice versa. Once we did that, we could also map some of the regulatory elements in the same way. For example, a regulatory element could be very active on the mom’s chromosome but not very active on the dad’s chromosome. By bringing that together, we can start to try to understand how variation in the regulatory region that is inherited from your parents leads to variation in gene expression as well.
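
As an illustration of the allele-specific comparison Reddy describes, the sketch below tests whether sequencing reads at one site favor the maternal or the paternal copy. It is a minimal sketch and not ENCODE's actual pipeline: the function name `allelic_imbalance`, the read counts, and the choice of a two-sided exact binomial test against an expected 50/50 split are all assumptions made for the example.

```python
from math import comb


def allelic_imbalance(maternal_reads: int, paternal_reads: int) -> tuple[float, float]:
    """Return (maternal fraction, two-sided exact binomial p-value vs. a 50/50 split).

    Hypothetical helper for illustration; the inputs are counts of reads
    carrying the maternal or paternal allele at one heterozygous site.
    """
    total = maternal_reads + paternal_reads
    if total == 0:
        raise ValueError("need at least one read")
    # Under the null hypothesis each read comes from either copy with p = 0.5,
    # so every outcome k has probability C(total, k) / 2**total.
    observed = comb(total, maternal_reads)
    # Two-sided p-value: sum the probabilities of all outcomes that are
    # no more likely than the observed one (exact integer comparison).
    tail = sum(comb(total, k) for k in range(total + 1) if comb(total, k) <= observed)
    return maternal_reads / total, tail / 2 ** total


# Made-up counts: 30 reads match the maternal allele, 10 the paternal allele.
fraction, p_value = allelic_imbalance(30, 10)
print(f"maternal fraction = {fraction:.2f}, p = {p_value:.4f}")
```

A small p-value would suggest that one parental copy is used more than the other at that site. Real analyses also have to deal with issues such as read-mapping bias and testing many sites at once, which this sketch ignores.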

TC: Why weren’t the sequences studied by the ENCODE project examined by the Human Genome Project? In other words, what made the researchers of the Human Genome Project pass over the “junk” DNA?

TR: Actually, they didn’t pass over it, not at all. You look at the most obvious problem first. In this case, the most obvious things to look at were the genes that make the proteins in our body that drive all of our cells; that was effectively the low-hanging fruit to study. The regulatory regions of the genome have always been much more difficult to study, and to do that in a comprehensive way has really only become technically possible these past few years. ENCODE has really capitalized on using this new technology to reveal a huge amount of the genome that we weren’t actually able to see before.

TC: How do you think the discoveries made by the ENCODE project affect the way we view genomic science and the role of DNA in humans?

TR: The biggest impact, and it has already started to happen a little bit, is going to be in understanding disease. One of the things that we’ll find when we look at various types of diseases, especially complex diseases such as diabetes and so on, is that there isn’t really a single mutation associated with the disease but actually many, many mutations. There have been human geneticists for decades now who have been trying to understand exactly what all of the variants across the human genome are that actually lead to or change your predisposition toward getting these diseases. There’s a lot of evidence that these mutations are all in regulatory regions. Now that ENCODE has started to map and identify these regulatory regions, presumably we can better understand some of the mechanisms underlying these genetic biases for diseases. Hopefully this can later translate into treatment.

TC: You mentioned the development of future treatments as a possible application of these findings. What other practical applications do these findings have?

TR: Medical treatment is one of the major ones. But you can also do some fun things, too. You can try to understand differences that are relatively benign, such as changes in hair color or changes in eye color. A whole slew of human phenotypes can be studied now in new ways that simply were not possible a couple of years ago.

TC: Why do you think it is important to continue studying the genome even with all the discoveries that have been made over the past decade?

TR: ENCODE is great in what it’s done—it’s definitely opened a lot of new doors for study. But to say that we truly understand the genome, we’re a long ways off from that still. Here’s an example: If you read through the paper, or read the abstract, you’ll see that 80 percent of the genome has some sort of functional purpose. But to really say what fraction of the genome is controlling specific genes, there’s still a whole lot to be understood there. Taking the next step and actually understanding the impact on gene regulation remains a huge area that needs to be studied.

TC: What do you think should be the next goal for scientists researching the genome?

TR: There are different approaches to different things, but for ENCODE, there are two main avenues that remain open. One is to look at primary tissues. ENCODE looked almost exclusively at cells grown in cell cultures. Those are very effective as models because they allow you to perform a lot of interesting experimental techniques that are very difficult to do if you’re trying to get skin from a person, where samples are limited. As these techniques are understood better, we can start to apply them to primary tissues—tissues that actually cause disease—and understand how these disease-causing tissues are actually contributing to whatever’s going on.

And then the other important thing is to now connect the regulatory elements to changes in how genes are used. If we can try and do those two things, they’ll keep us busy for another five or six years.
