Prospective projects for all students at BSc,
Observations that vary in both space and time are called spatio-temporal data. Example data sets include air pollution,
precipitation, temperature, disease-specific (e.g. Covid-19) case and death rates, brain imaging, and
ocean characteristics such as temperature, salinity and chlorophyll levels.
Data science techniques are used to extract the scientific information hidden in these large data sets, e.g. the long-term trend in global
warming. Examples of data science techniques include regression modelling and validation methods.
Intuitively, one can expect that spatio-temporal regression models, which exploit the spatio-temporal dependence in the data,
will perform better than regression models that assume independent and identically distributed (iid) errors.
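The intuition above can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the method used in any of the projects: the data are simulated, the separable exponential covariance and its parameters (`sigma2`, `phi_s`, `phi_t`, `nugget`) are hypothetical and treated as known, and the dependence-aware model is a plain generalised least squares fit with kriging-style prediction rather than a full Bayesian model. It compares out-of-sample root mean squared error (RMSE) for an iid-error regression and for a regression that uses the spatio-temporal dependence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design: 25 sites in the unit square, each observed at 12 time points.
n_sites, n_times = 25, 12
sites = rng.uniform(size=(n_sites, 2))
s_idx, t_idx = np.meshgrid(np.arange(n_sites), np.arange(n_times), indexing="ij")
s_idx, t_idx = s_idx.ravel(), t_idx.ravel()
n = s_idx.size

# Assumed separable exponential covariance in space and time, plus a nugget.
coords = sites[s_idx]
d_space = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
d_time = np.abs(t_idx[:, None] - t_idx[None, :])
sigma2, phi_s, phi_t, nugget = 1.0, 0.5, 3.0, 0.1
cov = sigma2 * np.exp(-d_space / phi_s) * np.exp(-d_time / phi_t) + nugget * np.eye(n)

# Simulate y = X beta + spatio-temporally correlated error.
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([2.0, 1.5])
y = X @ beta_true + np.linalg.cholesky(cov) @ rng.normal(size=n)

# Random hold-out set for out-of-sample validation.
perm = rng.permutation(n)
test, train = perm[:60], perm[60:]

# Model 1: ordinary least squares, i.e. regression with iid errors.
beta_ols, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
pred_iid = X[test] @ beta_ols

# Model 2: generalised least squares with kriging-style prediction,
# using the (assumed known) covariance between hold-out and training points.
C_tt = cov[np.ix_(train, train)]
C_pt = cov[np.ix_(test, train)]
Ci_X = np.linalg.solve(C_tt, X[train])
Ci_y = np.linalg.solve(C_tt, y[train])
beta_gls = np.linalg.solve(X[train].T @ Ci_X, X[train].T @ Ci_y)
resid = y[train] - X[train] @ beta_gls
pred_st = X[test] @ beta_gls + C_pt @ np.linalg.solve(C_tt, resid)


def rmse(obs, pred):
    """Root mean squared prediction error."""
    return float(np.sqrt(np.mean((obs - pred) ** 2)))


rmse_iid = rmse(y[test], pred_iid)
rmse_st = rmse(y[test], pred_st)
print(f"RMSE, iid-error regression:       {rmse_iid:.3f}")
print(f"RMSE, spatio-temporal regression: {rmse_st:.3f}")
```

In a real analysis the covariance parameters are unknown and would have to be estimated along with the regression coefficients, which is the harder part of the problem.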
If you are interested in:
- environmental statistics, you will have the opportunity to learn how to model large climate and air pollution data sets. You can aim to 'go green' with Prof Sahu!
A third-year Mathematics BSc student, Ms Jinran Zhan, worked on a very similar project in 2018-2019, and
the project led to a published research paper: Spatio-temporal Bayesian modeling of precipitation using rain gauge data from the Hubbard Brook Experimental Forest, New Hampshire, USA.
- data science, you will learn spatio-temporal regression models
that outperform independent-error regression models in out-of-sample validation.
This is indeed true for most data sets and you will have the opportunity to experience these results yourself.
You will have ample chance to learn to swim with data.
Witaya Bamrungpong, a student on our MSc in
Data & Decision Analytics, worked on a data science
project on air pollution modelling, which secured him the top prize from Boeing for 19/20 CORMSIS MSc Maths OR students.
- medical statistics, you will learn to perform Bayesian disease mapping for analysing live coronavirus pandemic data. A research paper jointly written with Prof Dankmar Boehning is also available.
If you are thinking of studying for a PhD, please email me. You may also want to see my research publication list and supervision record.
Getting started with each project is easy with the R package bmstdr, developed by Prof Sahu. The projects will also benefit from an accessible textbook on the same topic, currently being written by Prof Sahu.
The projects will suit students with a wide range of interests in theory and application at all levels:
- A mathematically strong and motivated student can develop the theory
behind the modelling so that new models can be fitted.
- A student with interests in data analytics and data science can
analyse a brand new spatio-temporal data set of their choice.
- A student aiming to gain key skills in R programming can develop and enhance the bmstdr package.
- It is possible to mix and match the above, i.e. theoretical development, application analytics and software development,
depending on your own interest and dedication.
Practical examples from projects done by past students:
Number of Covid-19 deaths per million people up to September 2020.
Annual percentage trend in ocean chlorophyll levels.
Average number of weekly Covid-19 deaths and levels of NO2 in England.
Annual average temperature in the north Atlantic in 2003 and
average air pollution in New York.
Annual precipitation and trend map of Hubbard Brook experimental forest in New Hampshire, USA
Air pollution modelling maps for eastern USA
Air pollution map and its standard deviation (sd) map for England and Wales.