Selected Publications

Medical data is increasingly moving into electronic formats. Endoscopic data in the UK is almost exclusively electronic. Pathology …

Medical data is increasingly kept in an electronic format worldwide (Bretthauer M. 2016). This serves many purposes including more …

Introduction: Most electronically stored endoscopic reports consist of semi-structured free text containing a significant amount of …

Our mission

Training clinicians in formal analytics of medical data to encourage reproducible audit and research.

Determining publicly available datasets and how to link them to power patient-focussed studies.

Institute for the development of reproducible methodologies for the analysis of large gastroenterology datasets.

Development of synthetic and open-source datasets to encourage cross-field, open-source solution development.



EndoMineR for the mining of endoscopic and pathological datasets

Gastro Book

Online and free gastroenterology textbook

PhysiMineR for the extraction of upper GI physiology data

A package written in R for the analysis of upper GI physiology datasets

Synthetic Endoscopy data

A methodology for the creation of synthetic Hospital Episode Statistics data

Synthetic Hospital Episode Statistics Data

A methodology for the creation of synthetic data.


The following course provides a basic overview of the use of R and its application to gastroenterological problems. It includes a synthetic data set so the code can be tried straight away.

  • Gastroenterology datascience: A free online tutorial covering the basics of data science, exploratory analysis and data visualisation using R, the most commonly used data science language. It is applied specifically to gastroenterological data, but the principles can be applied to almost any data set.


  • 02071887188
  • Gastroenterology Dept, St Thomas' Hospital, Westminster Bridge Road, London SE1 7EH
  • Monday 08:00 to 18:00 or email for appointment