Stochastic modelling

As a member of SynthSys - Centre for Synthetic and Systems Biology, I apply stochastic modelling techniques to biological systems. A detailed kinetic model of RNA polymerase II C-terminal domain phosphorylation is currently being defined in the Kappa language. More information about Kappa can be found on the Danos group webpages.

Transcription and splicing

I have been investigating noise in transcription in order to explain the large cell-to-cell variability in the number of mRNA copies observed in single cells. The top right figure shows a typical data set (Bertrand Lab, IGMM, CNRS, Montpellier) and the corresponding prediction of a stochastic model in which the gene switches between an on state and an off state.
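A two-state gene model of this kind can be sketched with a Gillespie simulation. The code below is only an illustration: the function names, the rate constants (k_on, k_off, k_tx, k_deg) and their values are assumptions chosen for the example, not the fitted model or the data shown in the figure.

```python
import random
import statistics

def simulate_cell(k_on, k_off, k_tx, k_deg, t_end, rng):
    """Return the mRNA copy number of one simulated cell at time t_end."""
    t, gene_on, mrna = 0.0, False, 0
    while True:
        # Propensities: gene activation, gene inactivation, transcription, mRNA decay.
        rates = [
            0.0 if gene_on else k_on,
            k_off if gene_on else 0.0,
            k_tx if gene_on else 0.0,
            k_deg * mrna,
        ]
        t += rng.expovariate(sum(rates))  # waiting time to the next reaction
        if t > t_end:
            return mrna
        # Pick which reaction fires, with probability proportional to its propensity.
        reaction = rng.choices(range(4), weights=rates)[0]
        if reaction == 0:
            gene_on = True
        elif reaction == 1:
            gene_on = False
        elif reaction == 2:
            mrna += 1
        else:
            mrna -= 1

def simulate_population(n_cells, seed=0, **params):
    """Simulate independent cells and return their mRNA copy numbers."""
    rng = random.Random(seed)
    return [simulate_cell(rng=rng, **params) for _ in range(n_cells)]

# Hypothetical rates (per unit time): the gene is mostly off and transcribes in bursts.
counts = simulate_population(500, k_on=0.1, k_off=0.9, k_tx=20.0, k_deg=1.0, t_end=20.0)
mean = statistics.mean(counts)
fano = statistics.pvariance(counts) / mean
print(f"mean copies/cell = {mean:.2f}, Fano factor = {fano:.2f}")
```

A Fano factor (variance divided by mean) well above 1 indicates more variability than a gene that is always on, whose copy numbers would be Poisson distributed.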
In collaboration with the Beggs Lab, the kinetics of transcription and splicing in the Ribo1 yeast reporter has been investigated by modelling transcription initiation, elongation of the pre-mRNA, and the two steps of the splicing reaction.
Splicing removes non-coding introns from the messenger RNA prior to the translation of the genetic information into protein. Splicing is an important process that is responsible for increasing the diversity of proteins that can be read from a DNA template [see figure middle right].
The kinetics of splicing in this yeast reporter can be studied thanks to the development of quantitative RT-PCR assays. The numbers of copies per cell of unspliced, intermediate and mature (intron removed) RNA in a population of cells constitute the data to be modelled. A detailed stochastic model of splicing has been developed, and fitted to the data by simulated annealing [see figure bottom right]. Modelling shows that splicing predominantly occurs while the RNA transcript is being synthesised, in contrast with many earlier suggestions in the literature.
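As a rough illustration of fitting by simulated annealing, the sketch below minimises a cost function with Metropolis moves and a geometric cooling schedule. Everything specific in it is an assumption made for the example: the toy steady-state model (copies/cell equal to the transcription rate divided by the rate of removal of each species), the made-up observed values and the annealing settings. It is not the detailed stochastic model described above.

```python
import math
import random

def simulated_annealing(cost, x0, n_steps, t_start, t_end, step_size, rng):
    """Minimise cost(x) with Metropolis moves under a geometric cooling schedule."""
    x, fx = list(x0), cost(x0)
    best_x, best_f = list(x), fx
    cooling = (t_end / t_start) ** (1.0 / n_steps)
    temp = t_start
    for _ in range(n_steps):
        # Propose a Gaussian perturbation of one randomly chosen parameter.
        cand = list(x)
        i = rng.randrange(len(cand))
        cand[i] += rng.gauss(0.0, step_size)
        f_cand = cost(cand)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if f_cand <= fx or rng.random() < math.exp((fx - f_cand) / temp):
            x, fx = cand, f_cand
            if fx < best_f:
                best_x, best_f = list(x), fx
        temp *= cooling
    return best_x, best_f

# Made-up copies/cell for illustration only: unspliced, intermediate, mature.
OBSERVED = [2.0, 0.5, 20.0]

def toy_cost(log_rates):
    """Sum of squared log-errors of a hypothetical linear steady-state model."""
    log_k_tx, log_k1, log_k2, log_k_deg = log_rates
    # Steady state of the toy chain: each level is k_tx divided by its removal rate.
    predicted = [log_k_tx - log_k1, log_k_tx - log_k2, log_k_tx - log_k_deg]
    return sum((p - math.log(o)) ** 2 for p, o in zip(predicted, OBSERVED))

best, best_cost = simulated_annealing(
    toy_cost, [0.0, 0.0, 0.0, 0.0],
    n_steps=20000, t_start=1.0, t_end=1e-4, step_size=0.1, rng=random.Random(0),
)
print("best log-rates:", [round(v, 2) for v in best], "cost:", best_cost)
```

Working in log-rates keeps all rates positive. In this toy model only ratios of rates are constrained by the three data values, so different seeds can return different parameter sets with an equally good fit.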

Model optimisation and comparison

Finding the parameter values that optimise the fit of a model to the data, and comparing alternative models, are two linked problems that follow from systems modelling.
Finding the optimal parameters is typically a complex problem due to the size of the parameter space. The problem is compounded where no analytic solution to the chemical master equation is known, and simulation of the model is required in order to estimate goodness of fit. Parameters in systems models are often unidentifiable due to the structure of the model and/or the available data. Optimisation strategies currently being used for the problems described above include:
In addition to finding good combinations of parameter values, it is also important to estimate the standard deviation or distribution of each parameter in order to determine confidence limits.

Nested sampling

Nested sampling is a Bayesian approach to calculating the evidence integral (Z) for a model given a data set, and therefore allows models to be compared. It also produces samples from the posterior as a by-product, from which the mean and standard deviation of the model parameters can be computed. Nested sampling thus addresses both problems described above, model comparison and parameter estimation with uncertainties, and is being further developed in a new BBSRC project, BB/I023461/1. This project is a collaboration with Prof Andrew Millar (U. Edinburgh) and Dr Ozgur Akman (U. Exeter).
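The basic nested sampling loop can be sketched on a toy problem where the evidence is known. The example below assumes a one-parameter model with a uniform prior on [-5, 5] and a Gaussian likelihood of width 0.5, so that Z is very close to 0.1. New live points are drawn by rejection sampling from the prior, which is only practical for toy problems like this one. All names and settings are illustrative and not taken from the project.

```python
import math
import random

def nested_sampling(log_like, sample_prior, n_live, n_iter, rng):
    """Return (Z, weighted samples) using the basic nested sampling recursion."""
    live = [sample_prior(rng) for _ in range(n_live)]
    live_logl = [log_like(p) for p in live]
    z, x_prev, samples = 0.0, 1.0, []
    for i in range(1, n_iter + 1):
        # The worst live point defines the current likelihood contour.
        worst = min(range(n_live), key=live_logl.__getitem__)
        logl_star = live_logl[worst]
        x_i = math.exp(-i / n_live)  # expected remaining prior mass
        weight = (x_prev - x_i) * math.exp(logl_star)
        z += weight
        samples.append((live[worst], weight))
        x_prev = x_i
        # Replace it with a prior draw inside the contour (rejection sampling).
        while True:
            cand = sample_prior(rng)
            cand_logl = log_like(cand)
            if cand_logl > logl_star:
                break
        live[worst], live_logl[worst] = cand, cand_logl
    # Add the contribution of the prior mass still covered by the live points.
    for p, logl in zip(live, live_logl):
        weight = x_prev * math.exp(logl) / n_live
        z += weight
        samples.append((p, weight))
    return z, samples

SIGMA = 0.5  # assumed likelihood width for the toy problem

def log_like(theta):
    return -0.5 * (theta / SIGMA) ** 2 - math.log(SIGMA * math.sqrt(2.0 * math.pi))

def sample_prior(rng):
    return rng.uniform(-5.0, 5.0)

z, samples = nested_sampling(log_like, sample_prior, n_live=200, n_iter=1200,
                             rng=random.Random(0))
post_mean = sum(w * th for th, w in samples) / z
post_sd = math.sqrt(sum(w * (th - post_mean) ** 2 for th, w in samples) / z)
print(f"ln Z = {math.log(z):.2f} (exact value is about {math.log(0.1):.2f})")
print(f"posterior mean = {post_mean:.2f}, standard deviation = {post_sd:.2f}")
```

The same weighted samples give both the evidence and the posterior summaries, which is why a single run serves model comparison and parameter estimation. The estimate of ln Z carries statistical scatter that shrinks as the number of live points grows.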