Use case
| Reference | Userreq.ChildObesity |
| High-level reqs | Techreq.R |
| Proposed by | Iain Buchan |
| | Taverna 1.3 | Taverna 2.0 |
| Priority | | |
Overview
There has been a dramatic increase in obesity in all age groups, in all
parts of the world. The increase accelerated during the 1990s and prevalence
continues to rise today. Obesity is reported in children as young as two
years. The cause(s) of this global epidemic are largely unknown, and its
consequences are difficult to estimate. We have shown that fast-growing
children are more susceptible than slow-growing children to the 'obesogenic
environment'. The long-term consequences of interactions between growth and
the obesity epidemic are unknown. Large-scale epidemiological studies of
obesity in childhood are therefore very important to present and future
public health. Many relevant data were collected as part of routine child
health surveillance in the 1990s, and additional obesity monitoring data are
being collected from children now. To analyse these data efficiently, a
systematic epidemiological workflow system needs to be developed.
Overall Goals
1) Identify data sources and governance requirements (e.g. data sharing
agreements).
2) Deconstruct the epidemiological investigations that have been or are
expected to be repeated with data from different localities - reflecting not
only upon conventional epidemiological practice but also upon generic
scientific method and how the research process can be described as a
'research object'.
3) Specify the meta-data for governance, data quality, statistical and
epidemiological analysis.
4) Specify data cleaning/preparation, visualisation and statistical analysis
workflows - summarise these in R scripts.
5) Build a workflow system for the analysis of Greater Manchester schools'
obesity surveillance data using myGrid services - evaluate this against
conventional, less automated alternatives.
Required services
Security;
Data;
Provenance;
R;
Mapping
Workflow outline
1) Verify workflow application and core data requirements with the owner.
2) Upload data.
3) Apply first-pass cleaning rules.
4) Notify the workflow owner and abort if data quality is below threshold(s).
5) Standardise height, weight and BMI data for age and sex using the LMS
formula and the 1990 British Growth Reference.
6) Apply second-pass cleaning (retain records whose standard deviation
scores (SDS) lie within ±5).
7) Notify the workflow owner and abort if data quality is below threshold(s).
8) Store data quality meta-data.
9) Request meta-data about any non-standard fields that the owner has
uploaded.
10) Classify records by the International Obesity Taskforce definitions.
11) Summarise data overall and by sub-groups (e.g. gender): statistics,
charts [optional: thematic maps / geographical information systems such as
CommonGIS?].
12) Make standard group comparisons and any owner-specified comparisons.
13) Store the statistical process and any critical result summaries in the
statistical meta-data.
14) Search for similar workflow runs within the given governance family and
summarise the current run against these.
15) Request and store additional interpretation meta-data from the owner.
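
The standardisation, second-pass cleaning and abort steps in the outline above can be sketched as follows. This is an illustrative sketch only, written in Python, although the goals anticipate R scripts. It uses the usual form of the LMS transformation, z = ((y/M)^L - 1) / (L*S), with z = ln(y/M) / S when L is zero. The function names, the LMS parameter values and the 95% quality threshold are all assumptions made for the example. They are not taken from the 1990 British Growth Reference or from any agreed workflow specification. In a real run, L, M and S would be looked up by age and sex from the reference tables.

```python
import math


def lms_sds(value, L, M, S):
    """Standard deviation score (SDS) of a measurement from LMS parameters.

    L is the Box-Cox power, M the median and S the coefficient of variation
    for the child's age and sex.
    """
    if value <= 0 or M <= 0 or S <= 0:
        raise ValueError("value, M and S must be positive")
    if abs(L) < 1e-12:
        # Limiting case of the formula as L approaches zero.
        return math.log(value / M) / S
    return ((value / M) ** L - 1.0) / (L * S)


def second_pass_clean(records, limit=5.0):
    """Split records into those with |SDS| <= limit and those outside it."""
    kept = [r for r in records if abs(r["sds"]) <= limit]
    rejected = [r for r in records if abs(r["sds"]) > limit]
    return kept, rejected


def quality_ok(n_kept, n_total, min_fraction=0.95):
    """Return True if enough records survived cleaning.

    min_fraction is a hypothetical threshold; the real value(s) would be
    agreed with the workflow owner.
    """
    return n_total > 0 and n_kept / n_total >= min_fraction


# Hypothetical LMS parameters for a single age/sex group (NOT reference values).
HYPOTHETICAL_LMS = {"L": -1.5, "M": 16.0, "S": 0.10}

# Made-up BMI values; the last one is an implausible entry.
bmi_values = [16.0, 18.5, 14.2, 60.0]
records = [{"bmi": b, "sds": lms_sds(b, **HYPOTHETICAL_LMS)} for b in bmi_values]

kept, rejected = second_pass_clean(records)

if not quality_ok(len(kept), len(records)):
    # In the workflow this is where the owner would be notified and the run aborted.
    print(f"Abort: only {len(kept)} of {len(records)} records passed cleaning")
```

Keeping rejected records instead of discarding them would let the data quality meta-data step record how many values were excluded at each pass, and why.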
Appendix