Breaking down health inequalities part II: data bias

In our previous article, Breaking down health inequalities, one ‘place’ at a time, we discussed the importance of using data analytics to reduce health inequalities at different levels (system, neighbourhood and place). Our analysis highlighted how much effective population health intelligence depends on good quality, unbiased data. In this article, we consider the common data issues that can introduce bias into advanced analytics, such as artificial intelligence (AI) and machine learning (ML), in health and social care.

Bias can have a significant impact on population health, leading to incorrect diagnoses and treatment. A prominent recent example is pulse oximetry during Covid-19, where the accuracy of readings was affected by patients’ skin colour.

Data collection

The lack of standardisation in data collection across health and social care is a widespread issue. Different data definitions, techniques and coding practices make it difficult to compare data from various sources.
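One way to make sources comparable is to map each provider's local codes to a shared vocabulary before analysis. The sketch below is illustrative only: the provider names, local codes and the smoking-status field are hypothetical and were chosen simply to show the idea.

```python
# Hypothetical lookup from (provider, local code) to one shared vocabulary.
# None of these codes come from a real system.
COMMON_SMOKING_STATUS = {
    ("provider_a", "S1"): "current",
    ("provider_a", "S2"): "former",
    ("provider_a", "S3"): "never",
    ("provider_b", "Y"): "current",
    ("provider_b", "EX"): "former",
    ("provider_b", "N"): "never",
}


def harmonise(provider, local_code):
    """Return the shared code, or 'unmapped' so gaps stay visible."""
    return COMMON_SMOKING_STATUS.get((provider, local_code), "unmapped")


# Records from two providers that describe the same status differently.
raw = [("provider_a", "S2"), ("provider_b", "EX"), ("provider_b", "??")]
harmonised = [harmonise(provider, code) for provider, code in raw]
```

Returning an explicit 'unmapped' value, rather than silently dropping the record, keeps unrecognised codes countable so they can be reviewed.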

Example 1

Different providers may use varying techniques and instruments to record blood pressure measurements. The auscultatory method can be subject to observer bias and measurement error. While the oscillometric method is less dependent on the observer, it can still lead to variations in recorded values based on the device and cuff size used.

Missing data

Missing data is another major issue in health and social care. It occurs when key data points are not collected or are recorded incompletely, whether because of system limitations, human error or other factors. We recently evaluated a national NHS disease prevention programme and found a large proportion of ‘unknown ethnicities’ in the data, so we had to adjust our approach.
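A useful first step is to measure how much of a field is missing, and whether the gap differs between sources. The following is a minimal sketch using made-up records; the field names, site labels and the set of values treated as missing are assumptions, not taken from the programme we evaluated.

```python
from collections import Counter

# Hypothetical records. Values are illustrative only.
records = [
    {"site": "A", "ethnicity": "Asian"},
    {"site": "A", "ethnicity": "Unknown"},
    {"site": "A", "ethnicity": None},
    {"site": "A", "ethnicity": "White"},
    {"site": "B", "ethnicity": "Black"},
    {"site": "B", "ethnicity": "White"},
    {"site": "B", "ethnicity": "Not stated"},
    {"site": "B", "ethnicity": "Asian"},
]

# Assumption: each of these values means the ethnicity is unusable for analysis.
MISSING_LABELS = {None, "", "Unknown", "Not stated"}


def missing_rate_by(rows, group_key, field):
    """Share of rows in each group where `field` is missing."""
    totals = Counter()
    missing = Counter()
    for row in rows:
        group = row[group_key]
        totals[group] += 1
        if row.get(field) in MISSING_LABELS:
            missing[group] += 1
    return {group: missing[group] / totals[group] for group in totals}


rates = missing_rate_by(records, "site", "ethnicity")
```

If the missing rate differs sharply between groups, simply dropping incomplete records is likely to skew the remaining sample.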

Example 2

Ethnicity data may be grouped into broad ethnic categories, such as Black or Asian. This can mask differences between ethnic subgroups.

By overlooking differences between ethnic subgroups, healthcare services will be unable to address the specific needs of communities, leading to unequal access to care and poorer health outcomes. To get the most from the data collected, we recommend aligning collection processes as closely as possible with common practices and data flows.
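The masking effect can be shown with simple arithmetic. In this sketch the subgroup names and counts are invented purely for illustration: two subgroups with very different rates of a condition average out to a single, unremarkable rate once they are combined.

```python
# Hypothetical counts: (subgroup, broad category, people, people with condition).
rows = [
    ("Subgroup 1", "Broad category X", 1000, 50),
    ("Subgroup 2", "Broad category X", 1000, 150),
]

# Rate within each subgroup.
subgroup_rates = {name: cases / people for name, _, people, cases in rows}

# Rate after collapsing subgroups into their broad category.
totals = {}
for _, category, people, cases in rows:
    people_so_far, cases_so_far = totals.get(category, (0, 0))
    totals[category] = (people_so_far + people, cases_so_far + cases)

broad_rates = {cat: cases / people for cat, (people, cases) in totals.items()}
```

Here the broad category shows a 10% rate, while the subgroups sit at 5% and 15%. A service planned around the combined figure would under-serve the second subgroup.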

Data quality

Data quality is critical in ensuring accurate analysis. Data entry errors, inconsistencies in coding, and inaccuracies in measurements can all lead to biased analysis.

Example 3

Some healthcare providers may record weight in imperial units (eg pounds), while others use metric units (eg kilograms). If the unit is not recorded or converted consistently, the resulting BMI calculation may be inaccurate, potentially leading to misclassification of a patient’s weight status.
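The size of the error is easy to demonstrate. The sketch below assumes each weight arrives with an explicit unit label ("kg" or "lb"), which is our assumption for illustration; the patient values are made up.

```python
LB_TO_KG = 0.45359237  # one international pound in kilograms


def to_kg(value, unit):
    """Convert a weight to kilograms, rejecting units we do not recognise."""
    if unit == "kg":
        return value
    if unit == "lb":
        return value * LB_TO_KG
    raise ValueError(f"Unrecognised weight unit: {unit!r}")


def bmi(weight, weight_unit, height_m):
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return to_kg(weight, weight_unit) / height_m ** 2


# A hypothetical patient: 154 lb and 1.75 m tall.
correct = bmi(154, "lb", 1.75)   # roughly 22.8
mistaken = bmi(154, "kg", 1.75)  # roughly 50.3 if pounds are read as kilograms
```

Raising an error on an unknown unit, instead of guessing, stops a mislabelled value from passing quietly into the analysis.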

Algorithm design

Patient data is often used in advanced analytical methods, such as risk prediction models, to predict the likelihood of future health outcomes, for example hospitalisation. However, these models can be biased if the algorithm is built on a dataset that is not representative of the population.

Example 4

Historical data that reflects disparities in health and social care access and quality can produce a biased risk prediction model. Such a model may have lower sensitivity for certain population groups and so predict their health outcomes incorrectly.
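One way to check for this is to calculate sensitivity (the share of actual cases the model correctly flags) separately for each population group. This is a minimal sketch with invented predictions; the group labels and results are hypothetical.

```python
from collections import defaultdict

# Hypothetical results: (group, actual outcome, predicted outcome), 1 = hospitalised.
results = [
    ("Group A", 1, 1), ("Group A", 1, 1), ("Group A", 1, 1), ("Group A", 1, 0),
    ("Group A", 0, 0), ("Group A", 0, 1),
    ("Group B", 1, 1), ("Group B", 1, 0), ("Group B", 1, 0), ("Group B", 1, 0),
    ("Group B", 0, 0), ("Group B", 0, 0),
]


def sensitivity_by_group(rows):
    """True positives / (true positives + false negatives), per group."""
    true_pos = defaultdict(int)
    false_neg = defaultdict(int)
    for group, actual, predicted in rows:
        if actual == 1:  # only actual cases count towards sensitivity
            if predicted == 1:
                true_pos[group] += 1
            else:
                false_neg[group] += 1
    groups = set(true_pos) | set(false_neg)
    return {g: true_pos[g] / (true_pos[g] + false_neg[g]) for g in groups}


sensitivity = sensitivity_by_group(results)
```

In this made-up data the model finds three in four actual cases in Group A but only one in four in Group B, a gap that an overall accuracy figure would hide.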

Using biased models can exacerbate health inequalities, as already disadvantaged population groups will continue to receive inadequate care or under-treatment. To improve the accuracy and effectiveness of risk prediction models and other algorithms, it is important to address bias at the pre-processing stage, for example by using balancing techniques such as oversampling under-represented groups.
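Random oversampling is one simple balancing technique: records from smaller groups are resampled with replacement until every group is the same size. The sketch below is one possible implementation under our own assumptions (a hypothetical `group` field and a fixed random seed), not a description of any specific tool.

```python
import random
from collections import Counter


def oversample(rows, key, seed=0):
    """Resample smaller groups with replacement until all match the largest."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    by_group = {}
    for row in rows:
        by_group.setdefault(row[key], []).append(row)
    target = max(len(members) for members in by_group.values())
    balanced = []
    for members in by_group.values():
        balanced.extend(members)  # keep every original record
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Hypothetical training data: six records from group A, two from group B.
training = [{"group": "A", "id": i} for i in range(6)]
training += [{"group": "B", "id": i} for i in range(6, 8)]

balanced = oversample(training, "group")
counts = Counter(row["group"] for row in balanced)
```

Because oversampling duplicates existing records, it should be applied to training data only; duplicating records in the data used to test the model would overstate its performance.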

How RSM can help

Addressing common data issues, such as those above, is crucial in reducing bias and ensuring accurate analysis to achieve equitable health outcomes. Our analytics team can provide both short-term and long-term help to improve data-driven health solutions and deal with data bias. RSM can help health organisations to improve their data collection processes and evaluate any advanced analytical technologies.

Our tools include:

AI assurance: providing assurance over algorithms and AI technology;
data processing: including data cleansing, data assessment and model reviews;
model development: our Data Science team can develop AI models using R, Python and Alteryx;
assistance with data sourcing from different systems;
creating common data models and data templates;
automating processes: including standardising model and data collection; and
balancing techniques: including methods of dealing with data bias, such as oversampling.
