Statistics and Data Science

Listed on this page are current research projects being offered for the Vacation Scholarship Program.

For more information on this research group see: Statistics

Using generalised additive models to model timeseries data

Statistical models are often used to characterise and predict temporal trends in quantities of interest. This includes, for example, modelling the changing patterns of human mobility and contacts during the COVID-19 pandemic to understand the risk of transmission and the impact of interventions, such as lockdowns. Regression models with splines and smoothing terms, such as generalised additive models (GAMs), are useful for this application because they are flexible enough to capture unpredictable temporal trends and make it easy to include step-change parameters. However, real-life timeseries data are almost always limited, both in how frequently they are sampled and in how unbiased and representative a sample of the population of interest they provide. It is important to thoroughly test how models behave when fitted to challenging data, so that we can use the models appropriately.

In this project, we will fit GAMs designed for modelling timeseries data to a variety of simulated timeseries datasets of human contact patterns. We will simulate data with specific issues, such as bias and missing observations, and we will explore how the models behave with these data in terms of predictive accuracy and ability to capture key characteristics of the timeseries. The findings of this project will inform the design of research software with real-life applications.
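The simulate, degrade, fit and score loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the series length, the shape of the true trend, the keep and reporting probabilities, and all function names are hypothetical, and a cubic polynomial with a step indicator fitted by least squares stands in for a GAM so that the example needs only NumPy.

```python
import numpy as np

N_DAYS = 200    # length of the simulated series (assumed)
STEP_DAY = 100  # day a hypothetical intervention begins (assumed)


def simulate_contacts(seed=0):
    """Return days, the true mean contact rate, and Poisson counts."""
    rng = np.random.default_rng(seed)
    t = np.arange(N_DAYS)
    # Smooth trend plus a step change at STEP_DAY (illustrative values).
    truth = 10.0 + 2.0 * np.sin(2 * np.pi * t / 100.0) - 5.0 * (t >= STEP_DAY)
    return t, truth, rng.poisson(truth)


def degrade(t, counts, keep_prob=0.5, report_prob=0.7, seed=1):
    """Drop days at random and under-report the counts that remain."""
    rng = np.random.default_rng(seed)
    keep = rng.random(t.size) < keep_prob
    return t[keep], rng.binomial(counts[keep], report_prob)


def design(t):
    """Cubic polynomial in scaled time plus a step-change indicator."""
    s = t / N_DAYS
    cols = [s ** k for k in range(4)]
    cols.append((t >= STEP_DAY).astype(float))
    return np.column_stack(cols)


def fit_step_model(t_obs, y_obs, t_new):
    """Least-squares fit on observed days, predicted on t_new."""
    coef, *_ = np.linalg.lstsq(design(t_obs), y_obs.astype(float), rcond=None)
    return design(t_new) @ coef


def rmse(pred, truth):
    return float(np.sqrt(np.mean((pred - truth) ** 2)))


t, truth, counts = simulate_contacts()
t_obs, y_obs = degrade(t, counts)
pred = fit_step_model(t_obs, y_obs, t)
print("days observed:", t_obs.size, "RMSE against the true mean:", rmse(pred, truth))
```

Varying `keep_prob` and `report_prob` and re-running the fit gives a simple picture of how missingness and reporting bias each affect predictive accuracy; a real study would replace the stand-in model with an actual GAM.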

Contact: Jennifer Flegg jennifer.flegg@unimelb.edu.au

Applying deep learning to problems in genetic epidemiology

In phylogenetics, we use genomic data from pathogens to study infectious disease. In this project the student will investigate using neural networks to tackle computational problems in phylogenetics.

Contact: Alex Zarebski azarebski@unimelb.edu.au

A portrait of intercellular communication in Waddington’s landscape

Waddington’s epigenetic landscape is an illustrative metaphor proposed by the biologist C.H. Waddington in the mid-20th century to describe cell development. The metaphor suggests that cell development is analogous to a marble rolling down a hill. Just as a marble rolls downhill until it comes to rest in a (local) valley, a cell develops along trajectories of an epigenetic landscape until it becomes a fully differentiated (or ‘developed’) cell. The features (peaks and troughs) of the epigenetic landscape are determined by the gene expression of the cell. Traditional mathematical models of the Waddington landscape used a deterministic approach that can only feasibly be applied to low-dimensional gene regulatory networks that are known in advance. These models did not account for stochasticity or for the influence of intercellular communication on gene expression, both of which are crucial for determining a cell’s future state, or fate. To address these limitations (and more!), this project will use a statistical mechanics approach to describe Waddington’s epigenetic landscape.
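As a toy illustration of the marble metaphor only, and not of the statistical mechanics approach the project will use, the sketch below rolls a one-dimensional "cell state" down an assumed double-well potential V(x) = x^4/4 - x^2/2, with optional noise added by an Euler-Maruyama step. The potential, the parameter values and the function names are all hypothetical.

```python
import numpy as np


def grad_potential(x):
    # Gradient of the assumed potential V(x) = x**4 / 4 - x**2 / 2,
    # which has valleys at x = -1 and x = +1 and a ridge at x = 0.
    return x ** 3 - x


def roll(x0, noise=0.0, dt=0.01, n_steps=5000, seed=0):
    """Follow the state downhill; noise > 0 adds random kicks."""
    rng = np.random.default_rng(seed)
    x = float(x0)
    for _ in range(n_steps):
        # Euler-Maruyama step: drift down the gradient plus scaled Gaussian noise.
        x += -grad_potential(x) * dt + np.sqrt(2.0 * noise * dt) * rng.normal()
    return x


print(roll(0.1))               # deterministic: settles in the valley near +1
print(roll(-0.1))              # deterministic: settles in the valley near -1
print(roll(0.1, noise=0.05))   # stochastic: the final state fluctuates
```

Without noise the starting side of the ridge fully determines the final valley; with noise the state can occasionally cross the ridge, which is the kind of behaviour a purely deterministic landscape model cannot represent.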

Contact: Michael Stumpf mstumpf@unimelb.edu.au

Hypergraph Animal Decomposition of Complex Networks

Hypergraph animals are small sub-networks which capture the local neighbourhoods of vertices in complex hypergraphs. They combine aspects of classical lattice animals and network motifs. We understand their combinatorial properties and their frequency spectra in random hypergraphs, and the next step in their analysis is to study their frequency spectra in real-world hypergraphs. Determining empirical distributions of hypergraph animals (so-called hypergraph zoos) and comparing them between different types of networks/hypergraphs will allow us to distill their functional relevance. This project will compare hypergraph zoos corresponding to metabolic and biochemical reaction networks in different species, in order to explore the functional role of hypergraph animals in real systems.

Contact: Michael Stumpf mstumpf@unimelb.edu.au