Overview

Project Highlights:

  • Uniquely comprehensive multiomics dataset for the discovery of genetic interactions that matter for adaptive evolution.
  • A unique animal model system that allows real-time observation of the evolutionary process.
  • A multidisciplinary learning environment that includes statistical genetics, ecology, evolution, comparative and functional genomics, bioinformatics and environmental sciences.

 

Overview:

A unique study system (the micro-crustacean Daphnia) is used to identify the mechanisms of adaptation to environmental change. These include epigenetic mechanisms that enable phenotypic plasticity, which may precede and inform the genetic fixation of adaptive traits. The environmental genomics model organism Daphnia produces diapausing embryos as part of its reproductive cycle. When buried in lake sediments, these embryos form a living archive of past populations that can be sampled and resuscitated in the laboratory after prolonged periods of time. Investigators at the University of Birmingham have hatched these dormant embryos to resume development, producing healthy reproductive adults, even after 700 years of sustained developmental arrest. From these hatchlings, populations of Daphnia are maintained indefinitely in the laboratory for evolutionary studies through 10,000 generations.

Using an “evolutionary genetic” panel consisting of Daphnia populations resuscitated from a lake sediment spanning 100 years of ecological change, the student will apply quantitative genetic theory to detect the genetic basis for plasticity in gene expression and functional gene-gene associations under multiple environmental conditions.

Transcriptional and epigenetic quantitative traits are a promising frontier in Daphnia. In other species, such work has revealed that variation in the abundance of gene transcripts is an important class of quantitative traits, and that a considerable fraction of the variation in gene expression is heritable. There is increasing evidence that quantitative trait loci (QTLs) associated with phenotypes of interest are more likely to be expression QTLs (eQTLs) than allelic variation for other genomic elements at equal frequencies. Most recently, a new class of QTLs called variance QTLs (veQTLs) has been discovered; these are genetic determinants of observed variability in gene expression.
veQTLs are revealed by conducting experiments capable of partitioning variance among individuals (e.g. recombinant inbred lines, or Daphnia clonal isolates) that have the same or different genomic backgrounds. Detecting such veQTLs is important for understanding adaptive evolution, because differences in veQTLs at a locus can result from epistasis among genetically interacting loci, from genotype-by-environment interactions, or from both. In all cases, these interactions produce context-dependent effects, and may themselves be targets of natural selection contributing to the adaptive potential of natural populations.
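
As a rough illustration of what a variance QTL looks like in data, the sketch below computes a Brown-Forsythe statistic, i.e. a one-way ANOVA F statistic on absolute deviations from each genotype group's median. This test is our choice for illustration only and is not stated to be the project's method; the genotype labels and expression values are invented.

```python
from statistics import mean, median

def brown_forsythe_statistic(groups):
    """One-way ANOVA F statistic on absolute deviations from group medians.

    Large values suggest the groups differ in variance (a veQTL-like signal),
    even when their means are the same.
    """
    # Absolute deviation of each observation from its own group's median.
    deviations = [[abs(x - median(g)) for x in g] for g in groups]
    k = len(deviations)
    n_total = sum(len(d) for d in deviations)
    group_means = [mean(d) for d in deviations]
    grand_mean = sum(sum(d) for d in deviations) / n_total
    between = sum(len(d) * (m - grand_mean) ** 2
                  for d, m in zip(deviations, group_means))
    within = sum((z - m) ** 2
                 for d, m in zip(deviations, group_means) for z in d)
    return (between / (k - 1)) / (within / (n_total - k))

# Hypothetical expression of one gene in clonal isolates, grouped by the
# genotype at one locus. Both groups have the same mean but different spread.
genotype_aa = [10.0, 10.1, 9.9, 10.2, 9.8]
genotype_bb = [10.0, 12.0, 8.0, 11.5, 8.5]
statistic = brown_forsythe_statistic([genotype_aa, genotype_bb])
print(f"Brown-Forsythe statistic: {statistic:.2f}")
```

A mean-based eQTL test would see no difference between these two groups, which is why variance is tested separately.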

Figure 1: Co-responsive Elements for Adaptation Revealed. The student will identify biologically important interactions between genomic elements at multiple scales, based on their expression along sets of sampled time intervals. At the level of the regulation of individual genes, the student will learn about the interactions between enhancers, promoters, and other functional elements that are important to modulate gene expression in response to lake eutrophication. At network scales, the student will discover epistatic relationships between genes that govern genotype-specific responses and associated changes in metabolism correlated with transcriptional shifts.

Methodology

The student will be given a uniquely large and diverse dataset to investigate the genetic basis of adaptive traits. RNA and metabolites are functional links between variation at the DNA and phenotypic levels. To understand the genetic basis of transcriptome and metabolome diversity, the student will identify expression QTLs for the mean (eQTLs) and variance (veQTLs) of expression to discover genetic interactions producing context-dependent phenotypic effects.

To characterize the genetic architecture of quantitative variation in gene and metabolic expression, the student will conduct a genome-wide association study (GWAS) to map expression QTLs (mean eQTLs) that regulate mean expression for each gene expression trait.

The final step will integrate the multiomics data types and genetic elements into co-response networks relevant to adaptive change. We hypothesize that most adaptations to the environmental challenge are governed by complex, possibly non-linear networks of interacting genes that affect the rates of adaptive evolution.

Training and Skills

The student will be trained in modern statistical genetic techniques that include genome-wide association study (GWAS), QTL, expression (e)QTL, and variance (ve)QTL analyses to characterize the genetic architecture of quantitative variation in the transcriptome, epigenome and metabolome responding to historical environmental challenges to a natural population. The student will become an expert in advanced machine or deep learning computational approaches to construct co-responsive regulatory networks and detect potentially higher order interactions effectively called “network Quantitative Trait Loci” (nQTLs), where the impact of a genetically or epigenetically regulated allele may depend on the state of larger networks.

Timeline

Year 1: Measure transcriptome and metabolome diversity. To understand the genetic basis of transcriptome and metabolome diversity, the student will identify expression QTLs for the mean (eQTLs) and variance (veQTLs) of expression. These observed differences in veQTLs at a locus will be a first indication of genetic interactions producing context-dependent effects. The results will indicate to what degree the observed plasticity in gene expression and metabolism is genetically determined.

Given the availability of genome sequences, the student will also calculate the genetic covariance (or genetic similarity) among the isolates of the evolutionary genetics panel, assuming an infinitesimal model. This will provide an estimate of how much variation in expression is heritable, by calculating the broad-sense heritability (H²) for each expression trait as the proportion of total variance explained by between-isolate differences. The results will indicate to what degree the observed plasticity in gene expression and metabolism is heritable while also indicating the predominant type and degree of genetic interactions.
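
The heritability calculation described above can be sketched with a simple variance-components estimate. The example below assumes a balanced design (the same number of replicate measurements per clonal isolate) and uses one-way ANOVA mean squares. It does not use the genomic covariance matrix mentioned above, so it should be read as a simplified stand-in. The isolate names and values are invented.

```python
from statistics import mean

def broad_sense_heritability(trait_by_isolate):
    """Estimate H^2 as between-isolate variance / total variance.

    trait_by_isolate maps an isolate name to its replicate measurements.
    A balanced design (equal replicates per isolate) is assumed.
    """
    groups = list(trait_by_isolate.values())
    k = len(groups)      # number of isolates
    n = len(groups[0])   # replicates per isolate
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes a balanced design")
    grand_mean = mean(x for g in groups for x in g)
    ms_between = n * sum((mean(g) - grand_mean) ** 2 for g in groups) / (k - 1)
    ms_within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (k * (n - 1))
    # Method-of-moments estimate of the between-isolate variance component,
    # truncated at zero because sampling noise can make it negative.
    var_between = max((ms_between - ms_within) / n, 0.0)
    return var_between / (var_between + ms_within)

# Hypothetical expression values for one transcript in three isolates.
expression = {
    "isolate_A": [5.0, 5.2, 4.8],
    "isolate_B": [7.0, 7.1, 6.9],
    "isolate_C": [9.0, 9.3, 8.7],
}
h2 = broad_sense_heritability(expression)
print(f"H^2 estimate: {h2:.3f}")
```

Because Daphnia clonal isolates are genetically identical within an isolate, within-isolate variance is treated here as environmental and between-isolate variance as genetic.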

Year 2: Discover QTLs associated with mean abundance and variance of expression. The student will conduct a genome-wide association study (GWAS) to map expression QTLs (mean eQTLs) that regulate mean expression for each gene expression trait, likely by fitting linear mixed models to multiple significant principal components of the genotypes, and will estimate isolate means for each genetically variable transcript and metabolic peak using best linear unbiased prediction. The student will then integrate epigenetic, transcriptional, and metabolomics data to identify networks, using the transcriptomics and metabolomics data to discover molecular responses to each treatment corresponding to the particular fitness-related trait. The student will exploit observations across the isolates to identify networks of co-regulated genes and co-modulated genomic regions using bi-clustering techniques followed by regression analysis using techniques like Random Forests and Deep Learning. The results will be the discovery of cohorts of genomic regions and components of metabolic and transcriptomic profiles that exhibit strong covariation, thereby forming networks and pathways.
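
To make the mean-eQTL mapping step more concrete, the sketch below tests one marker against one expression trait while adjusting for the leading principal components of the genotype matrix. It uses ordinary least squares with the components as fixed covariates, which is a deliberate simplification of the linear mixed models mentioned above. The data are simulated, and the function names are hypothetical.

```python
import numpy as np

def genotype_pcs(genotypes, n_pcs):
    """Leading principal component scores of an isolates x markers matrix."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_pcs] * s[:n_pcs]

def snp_association(expression, snp, covariates):
    """Fit expression ~ intercept + snp + covariates by least squares.

    Returns the estimated marker effect and its t statistic.
    """
    n = len(expression)
    design = np.column_stack([np.ones(n), snp, covariates])
    coef, _, _, _ = np.linalg.lstsq(design, expression, rcond=None)
    residuals = expression - design @ coef
    dof = n - design.shape[1]
    sigma2 = residuals @ residuals / dof
    coef_cov = sigma2 * np.linalg.inv(design.T @ design)
    return coef[1], coef[1] / np.sqrt(coef_cov[1, 1])

# Simulated data: 60 isolates, 200 markers coded 0/1/2 (all values invented).
rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(60, 200)).astype(float)
pcs = genotype_pcs(genotypes, n_pcs=2)

# One transcript whose mean expression depends on the first marker.
causal_snp = genotypes[:, 0]
expression = 1.5 * causal_snp + rng.normal(0.0, 0.5, size=60)

beta, t_stat = snp_association(expression, causal_snp, pcs)
print(f"estimated effect: {beta:.2f}, t statistic: {t_stat:.1f}")
```

In a genome-wide scan this test would be repeated for every marker and trait, with a correction for multiple testing that is omitted here.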

Year 3: Predict candidate networks and pathways relevant to adaptive change. The student will detect candidate networks underlying adaptive changes, requiring a flexible, yet computationally efficient modelling strategy. Sparse tensor decompositions may be useful in this setting, since low-rank decompositions constitute special types of latent variable models, and hence can be used to parameterize the transcriptional and metabolomics responses across genetic backgrounds and environmental conditions. The student may then learn epigenome-gene, gene-gene and gene-environment interactions of potentially higher order. To complement molecular measurements, the student will have access to quantitative adaptive phenotypic endpoint data, which may also correlate with selected features in a QTL to reveal network-level QTLs (nQTLs) under natural selection. The student will build a broad catalogue of nQTLs in Daphnia. Integrating these results with the insights gained in Year 2 will yield a set of testable hypotheses that link individual elements of the genome (genes, enhancers and other chromatin elements) to genome-wide transcriptional responses to environmental challenges.
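
The idea of a low-rank tensor decomposition as a latent variable model can be illustrated with a small example. The sketch below fits a plain CP (CANDECOMP/PARAFAC) decomposition by alternating least squares to a three-way array indexed by isolate, environment and gene. It has no sparsity penalty, so it is a simplification of the sparse decompositions mentioned above. The array dimensions and data are invented, and the function names are hypothetical.

```python
import numpy as np

def khatri_rao(a, b):
    """Column-wise Kronecker product; the row index of b varies fastest."""
    rank = a.shape[1]
    return (a[:, None, :] * b[None, :, :]).reshape(-1, rank)

def unfold(tensor, mode):
    """Matricize a tensor with the chosen mode as rows."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def cp_als(tensor, rank, n_iter=100, seed=0):
    """Fit a rank-`rank` CP model to a 3-way tensor by alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in tensor.shape]
    for _ in range(n_iter):
        for mode in range(3):
            first, second = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(first, second)
            # Gram matrix of the Khatri-Rao product, computed cheaply.
            gram = (first.T @ first) * (second.T @ second)
            factors[mode] = unfold(tensor, mode) @ kr @ np.linalg.pinv(gram)
    return factors

def reconstruct(factors):
    """Rebuild the tensor from its three factor matrices."""
    a, b, c = factors
    return np.einsum("ir,jr,kr->ijk", a, b, c)

# Simulated responses: 6 isolates x 3 environments x 8 genes, generated
# from two latent components (all values invented).
rng = np.random.default_rng(42)
true_factors = [rng.standard_normal((dim, 2)) for dim in (6, 3, 8)]
responses = reconstruct(true_factors)

fitted = cp_als(responses, rank=2)
relative_error = (np.linalg.norm(reconstruct(fitted) - responses)
                  / np.linalg.norm(responses))
print(f"relative reconstruction error: {relative_error:.2e}")
```

Each fitted component pairs a pattern across isolates with a pattern across environments and a pattern across genes, which is the sense in which such a model can suggest gene-gene and gene-environment interactions.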

The ambitious proposed work is made possible over 3-4 years by the resources and data being developed under the Natural Environment Research Council (NERC) highlight topic “Breaking the code of adaptive evolution”, which ends in 2020.

Partners and collaboration (including CASE)

This project is closely associated with other research at the University of Birmingham and beyond that focuses on diapausing Daphnia populations buried in lake sediments. This approach, called resurrection biology, makes it possible to witness the molecular mechanisms of adaptive evolution in natural populations over decades and centuries, set against the documented environmental changes that have occurred from local to regional scales. As such, partners and collaborators include the Natural History Museum (ancient DNA research), the UK Met Office (climate change), Natural England and the Environment Agency (biodiversity in the face of pollutants).

Further Details

Any questions about the project can be directed to:

Professor John Colbourne

Chair of Environmental Genomics

School of Biosciences,

University of Birmingham

Email: J.K.Colbourne@bham.ac.uk