Removing the pump handle – stewarding data at times of public health emergency
By examining our past, we can find lessons for our future - avoiding pitfalls and ensuring equitable outcomes.
8 April 2020
Reading time: 8 minutes
‘History doesn’t repeat itself, but it does like to rhyme’, in the words of nineteenth-century American writer and social commentator Mark Twain. Seeking to understand the implications of the current crisis for the effective use of data, I’ve drawn on the nineteenth-century cholera outbreak in London’s Soho to identify some ‘rhyming patterns’ that might inform our approaches to data use and governance at this contemporary time of public health crisis:
Data has a central role to play in saving lives: the effective use of (and access to) data matters in enabling timely responses to public health emergencies
Where better to begin than with the work of Victorian pioneer John Snow? In 1854 Snow’s use of a dot map to illustrate clusters of cholera cases around public water pumps, and of statistics to establish the connection between the quality of water sources and cholera outbreaks, led to a breakthrough in public health interventions – and the immediate removal of the Broad Street water pump identified as causing cholera deaths in Soho.
We owe a lot even now to Snow – take, for instance, transport app Citymapper’s rapid redeployment of its aggregated transport data. In the early days of the pandemic this formed part of an analysis of compliance with social distancing restrictions across a range of European cities. The US-based health weather map uses anonymised and aggregated data to visualise fever, specifically influenza-like illnesses – data that has helped model early indications of where, and how quickly, COVID-19 is spreading.
Another example of a real-time resource is the database established by Johns Hopkins University – a web-based dashboard that has enabled us to map (very rapidly) the growth rate of COVID-19 incidence, deaths and recoveries across the world. At a national level, there is the Against Covid 19 Singaporean dashboard, which traces how the outbreak is evolving over time, as well as its impacts in relation to age and gender. Many of these tools have been helping to support policymakers and national healthcare systems to predict and plan for demands on their healthcare systems, and to adapt their strategies as the pandemic unfolds. Access to data, and the ability to use it effectively in a timely fashion, matters – as historic and contemporary examples show.
Ethics and human rights are foundational to enabling trustworthy data use (yes, even, and especially in times of crisis)
As the current crisis evolves, many have expressed concern that it will be used to justify the rapid rollout of surveillance technologies that do not meet ethical and human rights standards, in the name of generating ‘public good’ outcomes – examples of these include symptom tracking and contact tracing applications. Contributing factors to these concerns might include a lack of trust in the institutions responsible for the governance of this data to use it ethically, and some governments’ lack of preparedness contributing to the risk of a knee-jerk panic reaction.
Against this backdrop, privacy experts are increasingly concerned that governments will be trading off more personal data than is necessary or proportionate to respond to the public health crisis. There are many ethical and human rights considerations that are at risk of being overlooked in the process (listed at the bottom of this piece).
The answer to resolving these is not to press ahead regardless, ignoring legitimate concerns about rights and standards, but to begin asking how we can prepare (now and in future) to establish clear and trusted boundaries for the use of data (personal and non-personal) in such crises.
Democratic states in Europe and the US have not, in recent memory, prioritised infrastructures and systems for a crisis of this scale – and this has contributed to our current predicament. Singapore, which suffered outbreaks of SARS in 2003 and H1N1 in 2009, put its learning into implementing pandemic preparedness measures. We cannot undo the past, but we can begin planning and preparing constructively for the future – which means strengthening global coordination and finding mechanisms to share learning internationally. Getting the right data infrastructure in place has a central role to play in this process.
At times of public health emergency, trust in institutions to use data well (embracing the change that follows) also matters
Returning to our Victorian pump handle in the time of cholera: although John Snow had persuaded government officials to remove the pump handle causing the majority of cholera outbreaks in Soho, his own explanation of how and why cholera broke out was rejected for months. The Board of Health issued a report that said, ‘We see no reason to adopt this belief’ – prompting Snow to continue to gather data about cases of cholera, tracing them back to the pump. Scientific orthodoxy at the time dictated the ‘miasma’ theory – that cholera was caused by breathing vapours in the atmosphere – and it took considerable time for Snow’s hypothesis to be taken seriously. In the meantime, people were falling ill and dying. So there is another side to this famous story. Data taken in isolation is quite literally no panacea. The limitation of dominant data narratives – ‘data is the new oil’, or ‘data is the new water’ – is that they invest so much agency and power in the data, at the expense of the agency and power held by people, cultures and systems.
There can be a disconnect between what the data says we should do, and what governments want to do – other short-term economic and political pressures push against the evidence base, compounding a natural resistance to change. The John Snow Society, at its annual Pumphandle lectures, commemorates, through a ceremonial removal and reattachment every year of a pump handle, the medical world’s ongoing struggle with governments across the world.
In a climate where the legitimacy of good data, expertise and credible evidence has taken a hammering, it is important that we make the case for the central role of data in informing decision making (especially in times of crisis). While data plays a role in responding to specific problems and challenges, in isolation it will have its limitations if our institutions do not find ways of first sourcing then gathering accurate data, understanding deeply the insights that emerge from it and then swiftly reacting to the trends and patterns identified.
These problems are cultural. Making the best possible use of data means we need change-ready systems, a willingness for assumptions and models underlying our evidence base to be questioned and unpicked (which requires more transparency about the assumptions, not less), trust in institutions at a time when levels of trust have declined, and the ability to pivot rapidly and depart from established traditions. If there is one thing this crisis will do, it is to force us to reimagine (yet again) the relationship between data, people and decision making.
Where next?
Those of us concerned with removing the modern-day pump handle are also concerned with what it takes for existing power structures and systems to change, to take new models and emerging data seriously, and to respond to the evidence they present.
While data can save lives at times of global public health crisis (and is already helping to do so), it can only do this effectively if its use, management and governance, even at times of crisis, are underpinned by clear rules (grounded in law, ethics and human rights) about how best to use data, and by trust in institutions to use data well. Those rules may be different to those we would propose in more ordinary times, but there are still rules. For us at the Ada Lovelace Institute a very live question, through the Rethinking Data programme, is how we can understand and articulate what good stewardship of data looks like at times of public health emergency.
It isn’t just the data itself that saves lives, but also the cultures of trustworthiness, transparency and timeliness that we manage to establish around our data systems. We cannot afford to overlook either.
Rethinking the data governance ecosystem matters more now than ever.
Considerations for data sharing in public health emergencies
- Purpose limitation: ensuring that the purposes of data collection technologies such as contact tracing are tightly constrained and specific to legitimate use for emergency scenarios in a public health crisis, and are not deployed for other unintended purposes such as surveillance, commercial, marketing, advertising, or other research purposes.
- Necessity and proportionality: ensuring data access and sharing mechanisms are necessary and proportionate to achieve their intended goals, by answering challenges like ‘Was this necessary to achieve the intended goal?’ or ‘Could something less intrusive/invasive have had the same or similar intended effect?’
- Transparency around data collection, and a clear deletion date: There needs to be clear indication of what data is collected, for what purpose, who has access to it and when it will be deleted.
- Clear boundaries to data sharing: Assurances that data access will be limited to what is necessary and will not be shared with commercial or public bodies, such as enforcement and immigration agencies, for purposes outside of emergency use.
- Clear boundaries to use and storage: Assurances establishing that there are clear boundaries for use and storage of personal data, with clear accountability for the parties involved in data processing.
- Anonymisation and data security: Although anonymisation should not be seen as a silver bullet, it’s preferable to use anonymous (or aggregate) datasets to increase protection, and it’s necessary to employ strong information security (regardless of whether the data is anonymised or not). As the European Data Protection Supervisor reminds us in its recent statement on the European Commission’s plan to massively collect telecommunications data, ‘effective anonymisation requires more than simply removing obvious identifiers’.
If you have comments or suggestions about this article, please feel free to take to Twitter to air them.
We are always open to new conversations and collaborations – please get in touch.