Multilevel modelling research, software and training builds statistical capacity to inform national and international public policy development and improve the accuracy of the information base
1. Summary of the impact
Complex multilevel datasets are a challenge to analyse, especially for non-specialists. The University of Bristol’s Centre for Multilevel Modelling (CMM) carries out cutting-edge statistical methodological research to address these challenges, incorporating its findings into statistical software, face-to-face training and training materials. This research, training and software have increased the capacity of non-academic and non-specialist beneficiaries to understand and apply these techniques in their work, and have significantly increased the number of highly trained non-academic users nationally and internationally. Since August 2013, the popular MLwiN multilevel modelling software has been purchased by 193 non-academic users internationally, including the UK Home Office, where bespoke training led by CMM researchers enabled multilevel modelling techniques to be applied to crime and policing policy. Additional software packages, as well as online training tools and resources, have supported professionals across many public sectors to apply multilevel modelling and thereby improve the accuracy of information for evidence-based policy making.
2. Underpinning research
Many kinds of data have a hierarchical, nested or clustered structure, in which data points are grouped at different levels. These groupings influence the response variable of interest and should be accounted for to avoid misleading outcomes. Browne and Goldstein are methodological statisticians responsible for landmark papers on methods for fitting multilevel models in classical (Goldstein, 1986, 1989) and Bayesian (e.g. Browne and Draper, 2000, 2006; Browne, Goldstein and Rasbash, 2001) frameworks. Much of Goldstein’s work is summarised in his seminal book [1], which is widely used across the social, medical and biological sciences.
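As a minimal illustration of the nested structure described above, a two-level random-intercept model for pupil i in school j can be written as follows. The notation (y, x, u, e and the variance symbols) is introduced here for illustration only and is not taken from the case study or from any particular CMM publication.

```latex
\begin{align}
  y_{ij} &= \beta_0 + \beta_1 x_{ij} + u_j + e_{ij}, \\
  u_j &\sim N(0, \sigma_u^2), \qquad e_{ij} \sim N(0, \sigma_e^2), \\
  \rho &= \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}.
\end{align}
```

Here u_j is the school-level effect shared by all pupils in school j, e_{ij} is the pupil-level residual, and ρ is the intra-class correlation, the share of residual variance attributable to the grouping. Ignoring u_j when ρ is non-negligible typically understates standard errors, which is one way in which unmodelled clustering can mislead.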
Since 2004, the Centre for Multilevel Modelling (CMM) has developed new methodology to fit statistical models that account for the complex data structures that exist in many applied problems found in educational and social data. The Centre consists of core-funded and grant-funded academics. It was directed by Rasbash from 2004 until his death in 2010 and has since been co-directed by Browne (with Steele prior to 2013 and Leckie since 2015).
The CMM team carry out methodological research motivated by educational examples. A recurrent feature of educational data is the clustering of data points (i.e. pupils) at different hierarchical scales (e.g. class, school). Innovative methodological research has advanced the field and extended multilevel models in various ways, including, for example, to account for spatially correlated school effects in studies of student outcomes [2]; modelling variability in multilevel models as illustrated by models for social and ethnic segregation in London schools [3]; and the problems of missing data and measurement errors in birth cohort studies and longitudinal surveys [4]. In each case a genuine modelling problem has been identified in an education application and new methodology developed to address it.
A distinct aspect of CMM’s work is the translation of these (computationally intensive) methodology developments into user-friendly statistical software packages (MLwiN [5], Stat-JR, Realcom and MLPowSim [6]) to give applied researchers direct access to new methods. The programming team, currently led by Charlton, continues to incorporate feature developments based on new methodology work (e.g. Browne et al., 2009; Browne, 2012; Goldstein, 2011; [3, 4, 5]). These packages are accompanied by extensive user documentation [5, 6] and by training disseminated via face-to-face user workshops and online training materials, including the Learning Environment for Multilevel Modelling (LEMMA) online multilevel modelling course developed as part of the ESRC LEMMA grants [vii], which brings the methodology and software to a very wide audience.
New interfaces from the Stata and R statistical packages have been developed by Leckie, Charlton, Parker, Zhang and Browne to give easier access to the methodology for a wider user base, and further research is ongoing to develop the Stat-JR package. Browne, Golalizadeh and Parker have additionally developed the MLPowSim package [6] for multilevel power calculations, whilst in recent work Browne, Charlton and Washbrook have added functionality to Stat-JR to allow the automation of training material generation [iii].
3. References to the research
[1] Goldstein H. (2011) Multilevel Statistical Models (4th Ed.). Chichester: Wiley. [Citations: 12,383] [Book available on request].
[2] Browne WJ & Goldstein H. (2010) MCMC sampling for a multilevel model with non-independent residuals within and between cluster units. Journal of Educational and Behavioral Statistics, 35(4), 453-473. DOI: 10.3102/1076998609359788 [Citations: 36]
[3] Leckie G, French R, Charlton C, Browne WJ. (2014) Modelling heterogeneous variance-covariance components in two-level models. Journal of Educational and Behavioral Statistics, 39(5), 307-332. DOI: 10.3102/1076998614546494 [Citations: 54]
[4] Goldstein H, Carpenter JR, Browne WJ. (2014) Fitting multilevel multivariate models with missing data in responses and covariates that may include interactions and non-linear terms. Journal of the Royal Statistical Society: Series A. 177(2), 553-564. DOI: 10.1111/rssa.12022 [Citations: 68]
[5] Rasbash J, Steele F, Browne WJ, Goldstein H. (2017) A User’s Guide to MLwiN v3.01. Centre for Multilevel Modelling, University of Bristol. Available at: http://www.bristol.ac.uk/cmm/media/software/mlwin/downloads/manuals/3-01/manual-web.pdf [Citations: 4,309]
[6] Browne WJ, Golalizadeh Lahi M, Parker RMA. (2009) A guide to sample size calculations for random effect models via simulation and the MLPowSim software package. University of Bristol. Available at: http://www.bristol.ac.uk/media-library/sites/cmm/migrated/documents/mlpowsim-manual.pdf [Citations: 73]
NB: All citations taken from Google Scholar as at July 2020.
Key Research grants:
[i] Browne WJ. et al. (2018-2019) Borrowing Strength – a collaborative software development for Small Area Estimation, ESRC: GBP99,000
[ii] Browne WJ, Leckie G & Goldstein H. (2017) Consultancy project on crime data, Home Office: GBP45,000
[iii] Browne WJ. et al. (2016-2018) Using Statistical E-books to teach undergraduate students quantitative methods and statistical software, British Academy: GBP115,000
[iv] Browne WJ. et al. (2013-2017) The use of interactive electronic-books in the teaching and application of modern quantitative methods in the social sciences, ESRC: GBP786,000
[v] Browne WJ. et al. (2009-2012) e-STAT – NCeSS quantitative node, ESRC: GBP1,100,000
[vi] Browne WJ. (2006-2009) Sample Size, Identifiability and MCMC Efficiency in Complex Random Effect Models, ESRC: GBP174,000
[vii] a) Rasbash J. et al. (2005-2008) Learning Environment for Multilevel Modelling Applications (LEMMA), ESRC: GBP650,000
b) Rasbash J. et al. (2008-2011) STRUCTURES for Building, Learning, Applying and Computing Statistical Models (LEMMA II), ESRC: GBP1,200,000
c) Steele F. et al. (2011-2013) Longitudinal Effects, Multilevel Modelling and Applications (LEMMA III), ESRC: GBP1,400,000
[viii] Goldstein H. et al. (2003-2005) Developing Multilevel Models for Realistically Complex Social Science Data, ESRC: GBP300,000
4. Details of the impact
Our research, software and training have achieved impact in terms of capacity building of advanced statistical skills in non-academic and non-specialist users, both nationally and internationally, as well as in directly improving the information underlying evidence-based policy making in the UK public sector and Government.
Building capacity of professional statisticians and quantitative researchers to apply multilevel modelling techniques in their work
Our flagship MLwiN software package [5] now has over 15,000 users, and since 1 August 2013 it has been purchased by a further 193 new non-academic organisations [A], including the UK Home Office [F] and the US Food and Drug Administration Office of Acquisitions and Grants Service. MLwiN is part of a suite of software packages, including Stat-JR, Realcom and MLPowSim, which are supplied either with MLwiN or as freeware. The linked online training course, Learning Environment for Multilevel Modelling (LEMMA), contains 15 graduated modules, starting from an introduction to quantitative research and progressing to multilevel modelling of continuous, binary, ordinal and nominal data. There are now over 37,000 users of LEMMA, with over 25,000 new users since 1 August 2013, 80% of whom are non-UK based. Since August 2013 there have been over 10,000 new non-academic users [B].
Since 2008 we have led annual courses with the Royal Statistical Society in London, with five courses between August 2013 and 2018, during which time we have trained 79 participants of whom 21 were government statisticians and 11 industry statisticians, including participants from the Swedish National Agency for Education, the UK Government Department for Health, Google and Virgin Media [C]. Each year, the course has received excellent feedback with an average score of 9.4 out of 10 agreeing the course will benefit their professional work [C]. Following one of these courses the Eurofound European Agency in Dublin requested a bespoke course which supported them to “[use] a multilevel approach to analysis of employment status in Europe which allows it to separate micro-/workplace-level and contextual effects. The conclusions of this research contribute towards the on-going EU policy debate on job quality and non-standard employment.” [C]. Both this example, and feedback from Ofsted [D], point to the importance of this robust analysis to inform policy makers; ‘the training course was very helpful for us in building our skills in data analysis we undertake on a regular basis which seeks to improve the overall quality of education and training in the UK, and inform policy makers about the effectiveness’ [D].
Our software packages (MLwiN and Stat-JR) are also used for training courses delivered by other researchers around the world. Since 2013 there have been over 30 courses open to non-academic participants, including 19 through the National Centre for Research Methods in the UK, and courses run in Australia, Canada and Germany. As one example, over the past three years Ian Dohoo from Canada has presented three courses using MLwiN to 67 participants, open to professional non-academic participants, including 20 participants at a course at the University of Queensland, Australia in 2018 [Ei, Eii].
Our latest software development, funded by the British Academy [iii], is new functionality in our Stat-JR software to automate the creation of training materials. This has encouraged others to work with us, including in 2019 the UK Data Service (UKDS), who have used Stat-JR to produce online training materials using their national data resources. This has impacted directly on their own training strategy, and they recognise that the materials produced will have significant impact on researchers in terms of learning basic statistical techniques using key datasets held by the UKDS. As the Director of User Support and Training for the UKDS stated: “The materials produced […] will make this dataset more visible to our wide community of existing researchers, help our training effort and attract new users to our resources.” [J].
Informing educational and social policy development by UK and International Public Sector & Government Services
Since August 2013, our statistical research [2-5] and software [1, 6] have been cited in over 25 national and international government and NGO reports in the areas of education and children’s services, community services, welfare services, public health, energy and climate change, agriculture, and defence, including by several UK government departments, the World Health Organisation, Public Health England, Defence Research and Development Canada, and Ofqual. This demonstrates that our research has been applied extensively by many public sector and government researchers to analyse complex data sets and has informed policy decisions.
In 2017, the Home Office approached CMM to apply our expertise in multilevel modelling to analyse the geographical predictors of crime and incidents. This contributed to the development of a key policing policy and resulted in an extensive report (not currently in the public domain) informing national policing strategy [F]. This commission from the Home Office [ii] also involved training government statisticians to ensure the analysis can be replicated with each new year of data, as well as applying multilevel modelling techniques to future policing policy. The Head of Policing and Police Resources Team, Crime and Policing Analysis, Home Office confirmed that “The capacity of our analytical team to undertake advanced multilevel statistical analysis has been markedly improved” and in addition “Our analysts have been able to apply these techniques to at least one other project, unconnected to the original commission.” [F].
Prior to 2014, we worked with the Higher Education Funding Council for England (HEFCE, now Office for Students) to support their use of MLwiN [1, 5] to address fair admission to university for different ethnic groups. In the current REF period, HEFCE asked Browne to assist with changes to the Unistats website used by all prospective university students to compare courses, utilising his sample size calculations work on the MLPowSim package [6]. Browne was approached in 2014 to help HEFCE with a consultation about reducing threshold levels below which data cannot be shown on the Unistats website (available until end of 2019, but now replaced by updated DiscoverUni). He was able to supply research-based statistical support to the proposal to reduce the threshold from 23 to 10 students, as well as indicating the need for more transparency in the methods and statistical uncertainty in the presentation of data [G, H]. These recommendations were implemented by HEFCE, with the Head of Research Analysis, Office for Students, stating: “These changes mean that not only do prospective students have more data available on more courses at more institutions, but they are also supplied with additional resources to aid them in understanding the data presented to them.” [G, H].
Finally, our research has been used to improve the accuracy of information underlying the evidence base for policy development. For example, our research on missing data [4] has been used by others to improve the statistical methodology employed to evaluate the effectiveness of at least three school-based randomised controlled trials carried out by the Education Endowment Foundation [I].
5. Sources to corroborate the impact
[A] MLwiN user database (2020). Download and sales figures report
[B] LEMMA training materials user database (2020). Registered users report
[C] i) Royal Statistical Society (2018). Executive Director - Factual statement
ii) Eurofound (2020). Research Officer - Factual statements
[D] Ofsted (2020). Factual statement
[E] University of Queensland, Australia (2018). i) email from MLwiN training provider and ii) MLwiN Training Course Flyer
[F] Home Office (2018). Head of Policing and Police Resources Team - Factual statement
[G] Office for Students (2018). Head of Profession and Head of Research Analysis - Factual statement
[H] HEFCE/Unistats (2015). (i) Consultation: Data publication thresholds and aggregation on Unistats. Technical advice from Browne (para 41), and consequent proposal to lower the publication threshold (para 63) and provide contextual information on methods and statistical uncertainty (para 64).
(ii) Outcomes of the consultation on data publication thresholds and aggregation on Unistats and for the NSS
[I] Education Endowment Foundation (2014). Chatterbooks Evaluation Report and Executive Summary
[J] UK Data Service (2019). Director of User Support and Training - Factual statement
Additional contextual information
Grant funding
Grant number | Value of grant |
---|---|
n/a | £99,000 |
n/a | £45,000 |
n/a | £115,000 |
ES/K007246/1 | £786,000 |
ES/G034834/1 | £1,100,000 |
RES-000-23-1190-A | £174,000 |
n/a | £650,000 |
ES/F031904/1 | £1,200,000 |
ES/I025065/1 | £300,000 |
n/a | £300,000 |