Mining Science Data for Medicine


Welcome to the website for the Mining Science Data for Medicine (MiSDaM) project.

LATEST NEWS

MiSDaM01 Challenge announced – see this website and download the Challenge Handbook from the workshop site.

MiSDaM01 Challenge Introductory Workshop - 16 April 2019.

Registration and further information at


This project resulted from a Sandpit Event organized by the STFC Global Challenges Network+ in Advanced Radiography.

You can learn more about this network at the following website -

What are we trying to do?

-  Obtain interesting medical science results with the potential to apply to individual patients.

-  Create a community of data miners to support the analysis of big data associated with medical science.

-  Identify the algorithms and visualisations that are useful in this science area.

Who are we?

The founding team for this project is:

Stephen Watts (Coordinator)    The University of Manchester
Miriam Berry                   NPL, University of Cambridge
Alfred Oliver                  Patient Representative
Ken Raj                        Public Health England
Marina Romanchikova            NPL, Teddington

How are we going to do this?

By releasing Challenges to the global community and inviting anyone to solve a specific problem using data mining.

The first challenge – MiSDaM01

We have followed a format similar to that of the GREAT08 Challenge in astrophysics. However, there is no leaderboard or formal competition. It is a collaborative challenge in which teams use different methods, which encourages friendly competition. Teams need to register, agree to the rules and conditions, obtain the data, and then attend a workshop in September 2019 to present their results. A summary of the results, comparing the methods used, will then be published with all teams involved.


The first challenge is in two parts, and participants are welcome to join in either or both parts.


The MiSDaM01 Challenge Handbook can be downloaded from the workshop website – see below.


1) Identification of a DNA methylation-based marker of cellular senescence.


DNA methylation-based markers are linked to the epigenetic clock theory of ageing. Is there a link to cellular senescence?


Markers have been obtained for 24 control and 24 irradiated cell populations from human donors. Values for 850,000 specific CpGs have been obtained for each sample.


Background Reference

DNA methylation-based biomarkers and the epigenetic clock theory of ageing.

Nature Reviews Genetics. 2018 Jun;19(6):371-384. doi: 10.1038/s41576-018-0004-3
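
One simple starting point for Part 1 is to compare each CpG between the two groups and rank the sites by how strongly they differ. The sketch below is only an illustration under stated assumptions: it uses synthetic stand-in values rather than the challenge data, the site names and the helper functions welch_t and rank_cpgs are hypothetical, and Welch's t statistic is just one of many possible scores. With 850,000 sites, any real analysis would also need a correction for multiple testing.

```python
import random
import statistics


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = (var_a / len(a) + var_b / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def rank_cpgs(control, irradiated):
    """Rank sites by |t|. Inputs map a site id to a list of per-sample values."""
    scores = {site: welch_t(irradiated[site], control[site]) for site in control}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Synthetic stand-in data (NOT the challenge data): 24 samples per group and
# 200 hypothetical sites, one of which is deliberately shifted after irradiation.
random.seed(0)
N_SAMPLES, N_SITES = 24, 200
control = {
    f"site{i}": [random.gauss(0.5, 0.05) for _ in range(N_SAMPLES)]
    for i in range(N_SITES)
}
irradiated = {
    f"site{i}": [random.gauss(0.5, 0.05) for _ in range(N_SAMPLES)]
    for i in range(N_SITES)
}
irradiated["site7"] = [random.gauss(0.7, 0.05) for _ in range(N_SAMPLES)]

ranking = rank_cpgs(control, irradiated)
top_site, top_t = ranking[0]
print(f"Most strongly shifted site: {top_site} (t = {top_t:.1f})")
```

A ranking of this kind only suggests candidate sites; whether any of them is a marker of cellular senescence is the question the challenge asks.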

2) Explaining Machine Learning (ML) results to patients and doctors


Patients need to be able to understand ML predictions, and doctors need to be able to both understand and explain them. This is an important and topical issue, especially in complicated situations with major healthcare consequences. See the background reference.


Teams are invited to suggest how ML results can be explained to patients, doctors and the public using

either a) their analysis of the cellular senescence data from Part 1,

or b) the much smaller and well-known CORIS dataset on coronary heart disease, which can be downloaded at

or both.

Background Reference

“Clinical applications of machine learning algorithms: beyond the black box”

David Watson et al. BMJ 2019;364:l886 doi: 10.1136/bmj.l886 (Published 12 March 2019)
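
As one possible illustration for Part 2, a linear model such as logistic regression can be explained by showing how much each measurement moves a patient's predicted odds relative to a reference patient. The sketch below is a minimal example under stated assumptions: the feature names, weights, intercept and patient values are all invented for illustration and are not fitted to the CORIS data or to any real dataset, and the explain function is hypothetical.

```python
import math


def explain(weights, intercept, baseline, patient):
    """Return the predicted risk and each feature's log-odds contribution
    relative to a baseline (reference) patient, for a logistic model."""
    contributions = {
        name: weights[name] * (patient[name] - baseline[name]) for name in weights
    }
    log_odds = intercept + sum(weights[name] * patient[name] for name in weights)
    risk = 1.0 / (1.0 + math.exp(-log_odds))
    return risk, contributions


# Invented numbers for illustration only (not fitted to any data).
weights = {"age": 0.05, "tobacco": 0.08}
intercept = -4.0
baseline = {"age": 40, "tobacco": 0.0}
patient = {"age": 60, "tobacco": 10.0}

risk, contributions = explain(weights, intercept, baseline, patient)
print(f"Predicted risk: {risk:.0%}")
for name, value in sorted(contributions.items(), key=lambda item: -abs(item[1])):
    factor = math.exp(value)
    print(f"{name}: multiplies the odds by {factor:.2f} compared with the reference patient")
```

Statements of the form "this factor multiplies your odds by X" are one candidate explanation style; whether patients and doctors find them clear is exactly what teams are invited to explore.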


How to register for the challenge?


Download the Challenge Handbook from the workshop website for instructions.


Email the coordinator at Stephen.Watts AT