What has data anonymisation got to do with shifting power in development?

Eve Grennan

Methodology, Research, Rights

Respecting research participants around the globe means protecting their privacy and right to consent and also ensuring their data is well used, say Eve Grennan and David Chiana – as they share newly anonymised data about the impact of Oxfam’s work, which can now be used by anyone. Links to the data sets are below.

An Oxfam team surveying Rohingya refugees in Cox’s Bazar, Bangladesh as part of a project to redesign toilet and washing facilities for women and girls living in the world’s largest refugee camp. The data collected in such NGO research needs to be handled carefully (Picture: Maruf Hasan/Oxfam)

For the past five years, Oxfam GB has made public and available for download certain individual and household survey data it collects to evaluate the impact of its work around the world. However, recent changes to data privacy law meant that, to remain freely available, the data needed to be more strictly anonymised.

We were brought in as external specialists on a placement at Oxfam to do this. In this blog, we outline why we and Oxfam believe availability of data and the way it is handled is so important – and some of the challenges we’ve faced in anonymising it (and you can find links to the anonymised data sets below).

Why share data?

Oxfam first shared data from past evaluations of its projects in 2016 (see a blog about the launch of the data sharing here) in the hope of increasing learning opportunities across the sector by making the primary data available to other researchers. Oxfam has itself been using this data, drawn from individual impact evaluations of programmes and projects around the globe, in a series of broad Effectiveness Reviews. These themed reviews of Oxfam’s work looked at broad areas such as resilience and women’s empowerment, with results published on Oxfam’s Policy & Practice knowledge hub in reports such as Measuring resilience: lessons learned from measuring resilience in Oxfam’s large-N Effectiveness Reviews, and Still learning: a critical reflection on three years of measuring women’s empowerment in Oxfam. The hope is that others can build on this work and draw useful insights and learning for the development sector from the wealth of data from Oxfam’s work over the years.

How do data sharing and anonymisation link to shifting power?

Oxfam’s goal of shifting power to the global majority (what some refer to as the “global south”) seems to us intrinsically linked with the process of releasing survey data to the public and the anonymisation of this data – and was raised often in our conversations with staff internally. We need to ensure that the time participants spend completing surveys, and the data that they provide, is valued and respected. That means maximising the opportunities for the information to be used, whilst respecting people’s privacy by anonymising it safely (and with their consent to share their data in an anonymous way).

‘Unequal power, ownership and control over data can also be seen as another aspect of global inequality – a central theme of Oxfam’s work and campaigning’

This anonymisation process is of particular interest amid deep concerns around data privacy – with monopolisation of data by tech giants, concerns about surveillance and worries about exploitation of data for profit and other harms. Unequal power, ownership and control over data can also be seen as another aspect of global inequality – a central theme of Oxfam’s work and campaigning. As Shoshana Zuboff says: “Our digital century was to have been democracy’s Golden Age. Instead, we enter its third decade marked by a stark new form of social inequality best understood as epistemic [or knowledge] inequality.” 

Organisations collecting data in low- and middle-income countries must avoid replicating the appropriation of people’s data and invasion of privacy by corporate tech giants. We need to build a reciprocal relationship that shifts the balance of power over data: crucial to this is anonymisation and consent from people about how we use their data, especially under General Data Protection Regulation (GDPR) rules that prohibit the selling of data to third parties without explicit consent.

How did GDPR drive this work?

The placement involved the anonymisation of survey data from Oxfam’s Effectiveness Review series (impact evaluations conducted as part of Oxfam GB’s Global Performance Framework) for release to the public via the UK Data Service. The Effectiveness Review datasets had already been anonymised and released to the public, but they needed further anonymisation to comply with the UK’s Data Protection Act 2018, which implemented the GDPR. Importantly, GDPR requires data anonymisation to be more stringent (see this article about GDPR and its Appendix 4 for more details).

The GDPR rules set out that anonymised data is ‘information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable’. Thus, anonymised data is data that effectively ensures the privacy of participants. 

What did anonymising datasets involve?

A key challenge was deciding how to truncate the data set: removing data that might make a participant identifiable. A key way to do this is to “cap” certain variables, i.e. group together data above or below a certain value, because people at the extreme ends of the spectrum are at greater risk of being identified. For example, we might cap household size so that all households with, say, 10 or more people fall into one category, as a household of 20 people might well be easily identified.
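To make the idea of capping concrete, here is a minimal sketch in Python. It is an illustration only: the post does not say which software was used for the Oxfam datasets, and the function name `top_code`, the threshold of 10 and the household sizes are all made up for the example.

```python
def top_code(values, cap):
    """Group every value at or above `cap` into one top category.

    Values below the cap are kept (as text), so a household of 20
    becomes indistinguishable from a household of 10.
    """
    return [f"{cap}+" if v >= cap else str(v) for v in values]


# Hypothetical household sizes, not taken from any Oxfam survey.
household_sizes = [3, 5, 10, 20, 7]
capped = top_code(household_sizes, cap=10)
print(capped)  # ['3', '5', '10+', '10+', '7']
```

The choice of threshold is the judgement call: a lower cap hides more unusual households but leaves less detail for researchers.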

However, there is a dilemma here: truncating decreases the data publicly available and so may limit its usefulness. This dilemma lay behind our decisions to cap several variables, such as the number of assets; number of house rooms; and distance to the nearest market. Making these decisions demanded close attention to detail to ensure no privacy risks were overlooked. That meant paying attention to data labels and notes, as well as redundant data. The work also required us to learn new programming techniques to streamline the anonymisation process (such as loop functions, which can be used to automate randomisation of data).
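As a rough sketch of how a loop can streamline this kind of work, the Python below applies a cap to several variables in one pass and then shuffles the row order. Everything specific here is an assumption for illustration: the variable names, the thresholds, the sample rows, and the reading of “randomisation” as shuffling rows. It is not a description of the procedure actually applied to the Oxfam datasets.

```python
import random

# Hypothetical variables and caps, chosen only for this example.
CAPS = {"n_assets": 15, "n_rooms": 6, "km_to_market": 30}


def anonymise(rows, caps, seed=None):
    """Return capped copies of `rows` in a shuffled order.

    The input rows are left unchanged.
    """
    capped_rows = []
    for row in rows:
        new_row = dict(row)  # copy, so the original survey data is untouched
        for name, cap in caps.items():
            if name in new_row:
                # Values above the cap are reduced to the cap itself.
                new_row[name] = min(new_row[name], cap)
        capped_rows.append(new_row)
    # Shuffle so that row position no longer reflects collection order.
    random.Random(seed).shuffle(capped_rows)
    return capped_rows


# Made-up survey rows.
survey = [
    {"n_assets": 4, "n_rooms": 2, "km_to_market": 3},
    {"n_assets": 40, "n_rooms": 9, "km_to_market": 120},
    {"n_assets": 12, "n_rooms": 6, "km_to_market": 30},
]
released = anonymise(survey, CAPS, seed=1)
```

Keeping the caps in one dictionary means each threshold decision is written down in a single place, which makes it easier to review whether any variable has been overlooked.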

How might Oxfam take this forward?

Although anonymisation helps Oxfam GB to hold to its values and supports shifting power, one question we have is which platform the data should sit on. Currently, the datasets are released to the UK Data Service. Is the UK Data Service known outside the UK? Would other platforms have greater reach? Are there other platforms that could lead to the data being better used by researchers or policy makers from the countries where the data came from in the first place? We hope that Oxfam will examine these questions and adapt how it releases its data in the future.

We enjoyed our data anonymisation placement at Oxfam GB, and found the organisation’s focus on shifting power exciting and engaging. We look forward to seeing how this continues to play out with Oxfam’s latest Effectiveness Reviews, where it is piloting a new approach that aims to shift power even more, by rethinking whose needs the data serves and how it is used. Finally, please find and use the datasets we anonymised, which are listed below!

Author

Eve Grennan

Eve Grennan was an Impact Evaluation Intern at Oxfam GB and has recently graduated with a degree in 'Politics, International Studies and Quantitative Methods' from the University of Warwick.

Author

David Chiana

David Chiana was an Impact Evaluation Intern at Oxfam GB and has recently graduated with a degree in 'Sociology and Quantitative Methods' from the University of Warwick.

You can explore the anonymised data sets at the links below. Please email Jaynie Vonk at jvonk1@oxfam.org.uk if you wish to share findings or have any questions about using Oxfam’s data.

Links to Oxfam data sets

SN 8976 – Household Survey for Evaluation of Nepal Resilience Project, 2016

SN 8977 – Individual Survey for Evaluation of Mali Women’s Empowerment Project, 2016

SN 8978 – Household Survey for Evaluation of Ethiopia and Somaliland Resilience Project, 2015

SN 8979 – Household Survey for Evaluation of Democratic Republic of Congo Livelihoods Project, 2015

SN 8980 – Household Survey for Evaluation of Chad Resilience and Food Security Project, 2014

SN 8981 – Household Survey for Evaluation of Bolivia Livelihoods Resilience Project, 2015

SN 8982 – Individual Survey for Evaluation of Armenia Women’s Empowerment Project, 2015