January 2016

The Asthma E-Lab: Discovering Subtypes of Disease with Model-based Machine Learning

Danielle Belgrave (Imperial College London) 
Christopher Munro (University of Manchester)

Venue:  Room A54, Postgraduate Statistics Centre, Lancaster University

Date:  Thursday 21-01-2016, 4 - 5pm 

Along with an exponential increase in the amount of biomedical data being generated, there have been major advances in computationally intensive statistical machine learning techniques for identifying patterns within such large data sets. One particular area of interest is the use of data reduction techniques (such as latent class analysis) on longitudinal profiles of patients’ symptoms to identify meaningful patterns and help disentangle subtypes of disease. Latent class analysis is a statistical method that identifies heterogeneous groups of patients with different symptom characteristics. Statistical machine learning and pattern recognition have been increasingly applied in asthma and allergic disease, specifically to identify distinct wheezing phenotypes during childhood and distinct subtypes of atopy based on skin test and IgE responses. We present novel latent variable models that are applied to understand the underlying disease heterogeneity.

To understand the generalizability of these models, we have developed a consortium of birth cohorts across the UK known as the Study Team for Early Life Asthma Research (STELAR). The aims of this project are to develop a web-based Asthma e-Lab that combines rich phenotypic data across these birth cohorts, and to develop innovative computational statistical methods to identify novel endotypes of childhood asthma, enabling investigation of endotype-specific environmental and genetic associates and discovery of endotype-specific pathophysiological mechanisms. The e-Lab serves as a data repository for our unified dataset and provides the computational resources and a scientific social network to support collaborative research. All activities are transparent, and emerging findings are shared via the e-Lab, linked to explanations of the analytical methods, thus enabling knowledge transfer. The e-Lab facilitates iterative interdisciplinary dialogue between clinicians, statisticians, computer scientists, mathematicians, geneticists and basic scientists, capturing the collective thought behind the interpretation of findings.