Stochastic modelling
As a member of the SynthSys - Centre for Synthetic and Systems Biology, I apply stochastic modelling techniques to biological systems. A detailed kinetic model of RNA polymerase II C-terminal domain phosphorylation is currently being defined in the Kappa language. More information about Kappa can be found on the Danos group webpages.
Transcription and splicing
I have been investigating noise in transcription in order
to explain the large variability in the number of mRNA
copies per cell observed in single cells. The top right figure shows a
typical data set (Bertrand Lab, IGMM, CNRS, Montpellier)
and corresponding prediction of a stochastic model in
which the gene switches state from on to off, and back.
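The on/off switching model can be illustrated with a short Gillespie simulation. This is a minimal sketch of a generic two-state gene, not the model fitted to the data above: the rate names (`k_on`, `k_off`, `k_tx`, `k_deg`) and the values used in the usage note are assumptions chosen for illustration only.

```python
import random


def simulate_telegraph(k_on, k_off, k_tx, k_deg, t_end, seed=0):
    """Gillespie simulation of one cell with a two-state (on/off) gene.

    Returns the mRNA copy number at time t_end. All rates are
    hypothetical and assumed to be strictly positive.
    """
    rng = random.Random(seed)
    t, gene_on, mrna = 0.0, False, 0
    while True:
        # Propensities of the four reactions in the current state.
        rates = [
            0.0 if gene_on else k_on,   # gene switches off -> on
            k_off if gene_on else 0.0,  # gene switches on -> off
            k_tx if gene_on else 0.0,   # transcription (only when on)
            k_deg * mrna,               # mRNA degradation
        ]
        total = sum(rates)
        # Waiting time to the next reaction is exponentially distributed.
        t += rng.expovariate(total)
        if t > t_end:
            return mrna
        # Choose which reaction fires, in proportion to its propensity.
        u = rng.random() * total
        if u < rates[0]:
            gene_on = True
        elif u < rates[0] + rates[1]:
            gene_on = False
        elif u < rates[0] + rates[1] + rates[2]:
            mrna += 1
        elif mrna > 0:  # guard against floating-point round-off
            mrna -= 1


def sample_population(n_cells, **params):
    """Simulate independent cells, one random seed per cell."""
    return [simulate_telegraph(seed=i, **params) for i in range(n_cells)]
```

Calling `sample_population(300, k_on=0.5, k_off=1.5, k_tx=20.0, k_deg=1.0, t_end=20.0)` gives a list of copy numbers, one per simulated cell. With slow switching relative to transcription, the variance across cells is expected to exceed the mean, which is the kind of cell-to-cell variability the switching model is meant to capture.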
In collaboration with the
Beggs Lab, the kinetics of transcription and splicing
in the Ribo1 yeast reporter has been investigated by modelling transcription
initiation, elongation of the pre-mRNA, and the two steps
of the splicing reaction.
Splicing removes non-coding
introns from the messenger RNA prior to the translation of
the genetic information into protein. Splicing is an
important process that is responsible for increasing the
diversity of proteins that can be read from a DNA template
[see figure middle right].
The kinetics of splicing in this yeast reporter can be studied thanks to the
development of quantitative RT-PCR assays.
The numbers of copies per cell of unspliced, intermediate and mature (intron-removed) RNA
in a population of cells constitute the data to be
modelled. A detailed stochastic model of splicing has been
developed, and fitted to the data by simulated annealing
[see figure bottom right]. Modelling shows
that splicing predominantly occurs while the RNA transcript is being
synthesised, in contrast with many earlier suggestions in
the literature.
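The fitting step can be sketched as follows. This is an illustration only and not the detailed splicing model described above: it assumes a hypothetical linear chain (unspliced, intermediate, mature) at steady state, with rates `k1`, `k2`, `k_deg` and a fixed synthesis rate `k_syn`, and a generic simulated annealing loop with a geometric cooling schedule. The observed copy numbers in the usage note are made up.

```python
import math
import random


def steady_state(log_rates, k_syn=1.0):
    """Steady-state copies/cell for a hypothetical linear splicing chain.

    unspliced --k1--> intermediate --k2--> mature --k_deg--> (degraded)
    Rates are passed as logarithms so that they stay positive.
    """
    k1, k2, k_deg = (math.exp(x) for x in log_rates)
    return (k_syn / k1, k_syn / k2, k_syn / k_deg)


def misfit(log_rates, observed):
    """Sum of squared log-ratios between predicted and observed copies/cell."""
    predicted = steady_state(log_rates)
    return sum((math.log(p) - math.log(o)) ** 2
               for p, o in zip(predicted, observed))


def simulated_annealing(cost, start, n_steps=5000, t_start=1.0,
                        t_end=1e-4, step=0.3, seed=0):
    """Minimise cost(x) by simulated annealing; returns (best_x, best_cost)."""
    rng = random.Random(seed)
    current, current_cost = list(start), cost(start)
    best, best_cost = list(current), current_cost
    for i in range(n_steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (i / (n_steps - 1))
        # Propose a Gaussian perturbation of every parameter.
        proposal = [x + rng.gauss(0.0, step) for x in current]
        proposal_cost = cost(proposal)
        delta = proposal_cost - current_cost
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / temp) (Metropolis criterion).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = proposal, proposal_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
    return best, best_cost
```

For example, with made-up observations `observed = (2.0, 0.5, 20.0)`, calling `simulated_annealing(lambda x: misfit(x, observed), [0.0, 0.0, 0.0])` returns log-rates whose steady state should lie close to the observed values. For a stochastic model without an analytic solution, `misfit` would instead compare the data with simulation output, which makes each cost evaluation noisy and far more expensive.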
Model optimisation and comparison
Finding the parameter values that optimise the fit of a model to the data, and comparing alternative models, are two linked problems that follow from systems modelling. Finding the optimal parameters is typically a complex problem due to the size of the parameter space. The problem is compounded where no analytic solution to the chemical master equation is known, and simulation of the model is required in order to estimate goodness of fit. Parameters in systems models are often unidentifiable due to the structure of the model and/or due to the available data. Optimisation strategies currently being used for the problems described above include:
- parameter sweep and local search
- simulated annealing
- Nested Sampling (poster at MLSB 2010)
- Expectation Maximisation
Nested Sampling
Nested sampling is a Bayesian approach to the calculation of the evidence integral (Z) for a model, given a data set, and therefore allows models to be compared. Nested sampling produces samples from the posterior as a by-product. The mean and standard deviation of the model parameters can be computed as a result. Nested sampling solves two key problems in systems modelling and is being further developed in a new BBSRC project BB/I023461/1. This project is a collaboration with Prof Andrew Millar (U. Edinburgh) and Dr Ozgur Akman (U. Exeter).
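The way the evidence and the posterior summaries arise from the same run can be seen in a toy example. The sketch below makes several assumptions that are not part of the project: a single parameter with a uniform prior on [0, 1], a Gaussian likelihood (`mu` and `sigma` are made-up values), the usual deterministic approximation X_i = exp(-i/N) for the remaining prior mass, and naive rejection sampling from the prior to replace the worst live point, which is only workable for toy problems.

```python
import math
import random


def log_likelihood(theta, mu=0.5, sigma=0.1):
    """Toy Gaussian log-likelihood (mu and sigma are illustrative values)."""
    return (-0.5 * ((theta - mu) / sigma) ** 2
            - math.log(sigma * math.sqrt(2.0 * math.pi)))


def nested_sampling(loglike, n_live=200, n_iter=1200, seed=0):
    """Toy nested sampling with a uniform prior on [0, 1].

    Returns (Z, posterior mean, posterior standard deviation).
    """
    rng = random.Random(seed)
    live = [rng.random() for _ in range(n_live)]
    live_logl = [loglike(theta) for theta in live]
    samples = []  # (theta, weight), weight = likelihood * prior mass
    x_prev = 1.0  # prior mass still enclosed by the live points
    for i in range(1, n_iter + 1):
        # The worst live point defines the current likelihood contour.
        worst = min(range(n_live), key=live_logl.__getitem__)
        x_i = math.exp(-i / n_live)  # expected shrinkage of prior mass
        weight = math.exp(live_logl[worst]) * (x_prev - x_i)
        samples.append((live[worst], weight))
        # Replace it by a prior draw with higher likelihood
        # (plain rejection sampling, for illustration only).
        threshold = live_logl[worst]
        while True:
            theta = rng.random()
            logl = loglike(theta)
            if logl > threshold:
                break
        live[worst], live_logl[worst] = theta, logl
        x_prev = x_i
    # The remaining live points share the leftover prior mass equally.
    for theta, logl in zip(live, live_logl):
        samples.append((theta, math.exp(logl) * x_prev / n_live))
    z = sum(w for _, w in samples)
    # Weighted samples approximate the posterior as a by-product.
    mean = sum(theta * w for theta, w in samples) / z
    var = sum((theta - mean) ** 2 * w for theta, w in samples) / z
    return z, mean, math.sqrt(var)
```

Calling `nested_sampling(log_likelihood)` returns an estimate of Z together with the posterior mean and standard deviation of the parameter. For this toy likelihood the exact evidence is very close to 1, so the estimate can be checked directly. Comparing two models amounts to running the procedure once per model and taking the ratio of the two evidence values.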