Reproducible research and audit are an important part of day-to-day gastroenterology. Automating audit can also save units a great deal of time and supports the quality improvements that units strive for without being burdensome. Unfortunately, the tools for reproducible and automated audit and research are beyond the skillset of most clinicians, as they are not a standard part of clinical training.
The Institute therefore has two aims for gastroenterological data analytics training:
Understanding Gastroenterological datasets
Gastroenterological datasets are similar to many medical datasets insofar as they are predominantly episode-based (as opposed to patient-based). The implication is that a patient episode is usually identified by a combination of the patient’s unique identifier and a date.
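The episode-based layout can be illustrated with a minimal sketch, written here in Python purely for illustration (the same idea carries over directly to R). The records, the field names (patient_id, episode_date, procedure) and the helper episode_key are hypothetical, and the sketch assumes at most one episode per patient per day.

```python
from collections import defaultdict
from datetime import date

# Hypothetical episode records: one row per endoscopy, not one row per patient.
episodes = [
    {"patient_id": "H0001", "episode_date": date(2019, 3, 4), "procedure": "Gastroscopy"},
    {"patient_id": "H0001", "episode_date": date(2020, 3, 9), "procedure": "Gastroscopy"},
    {"patient_id": "H0002", "episode_date": date(2019, 3, 4), "procedure": "Colonoscopy"},
]


def episode_key(record):
    """Combine the patient identifier and the date into one episode identifier.

    Assumption: a patient has at most one episode on any given day.
    """
    return f"{record['patient_id']}_{record['episode_date'].isoformat()}"


# One identifier per episode.
episode_ids = [episode_key(r) for r in episodes]

# Grouping episodes by patient, in date order, recovers the patient-based view.
by_patient = defaultdict(list)
for r in sorted(episodes, key=lambda r: r["episode_date"]):
    by_patient[r["patient_id"]].append(episode_key(r))

print(episode_ids)
print(dict(by_patient))
```

Neither the patient identifier nor the date is unique on its own in this example: H0001 appears twice, and two patients share a date. Only the combination identifies a single episode.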
Once this basic tenet of data organisation is understood, much of the downstream analysis is simple to follow. A framework for understanding the different types of analysis as applied to practical gastroenterology is explained here.
This conceptual framework provides the structure for the published, open-source tool that analyses endoscopic and pathological datasets.
Understanding the basic analytical tools
A lot of large-dataset analysis is coded, i.e. scripted using a programming language. This is a very efficient way to share methodologies and reproduce analyses. Of course, the language has to be learnt, which involves a learning curve.
It is the aim of the Institute to provide basic resources for the practising clinician to learn one of the main data analytics languages, R, so that clinicians can create and run their own analyses. Some of these resources are already online, and through collaborators such as NHS-R we contribute to the dissemination of tutorials for the practising clinician.