
From DNA Sequence to Expression: a Quantitative Approach from Yeast to Human

Eran Segal

1 Collaborator(s)

Funding source

National Cancer Institute (NIH)
The ability to control the activity level of different genes is key to fundamental biological processes such as development and differentiation, and many human diseases are caused by defects in this regulatory process. This regulation is encoded within specific regions of the genome, termed regulatory regions. Indeed, in many studies of cancer and of other diseases and human phenotypes, changes in gene activity that are tightly linked to the disease state have in turn been linked to changes in the DNA sequence of the genes' regulatory regions. However, we currently have a poor understanding of how gene activity is encoded by DNA sequence, and thus we do not understand the mechanism by which these disease-linked sequence changes cause the observed changes in gene activity. Given the many studies of gene regulation that have been carried out, it is surprising how little we know about this mapping between DNA sequence and gene activity.

In principle, such questions can be answered directly through accurate measurements of regulatory regions in which various sequence elements are varied systematically. However, such data does not currently exist, most likely due to the technical difficulties in constructing such sequences and accurately measuring their activity. Here, we aim to derive a mechanistic understanding of how gene activity patterns are encoded in DNA sequence, and to arrive at a quantitative model that describes the entire process: from the activity of the regulating proteins, termed transcription factors, to their binding to regulatory regions, through the important role of DNA packaging in this process, and up to the gene activity patterns that result from this binding.

A systematic study of such interactions requires the ability to efficiently synthesize many different regulatory sequences and to accurately measure their activity. We have recently developed such capabilities, which we will utilize in this project. Specifically, we will design regulatory sequences that systematically test the quantitative contribution of various types of sequence elements to gene activity, measure their activity, and integrate the resulting data into a unified model of gene regulation. We will then use this model to examine how such regulatory sequence elements are used in native promoters to achieve biologically meaningful activity patterns, and how changes in these sequence elements during evolution contribute to evolutionary changes in gene activity. Finally, we will apply the model to predict gene activity changes among human individuals, using the emerging genotype data that is rapidly being collected.

If successful, our project should have far-reaching implications. Most notably, since changes in gene activity levels play a key role in the development of cancer and of many other diseases, even a partial ability to predict gene activity changes among human individuals from the genotype information that is rapidly being collected for them could have important medical implications.
