Report

Participatory and inclusive data stewardship

A landscape review

Roshni Modhvadia, Octavia Field Reid

13 December 2024

Reading time: 242 minutes

Contributing authors: Valentina Pavel, Luke Patterson

Project: Participatory and inclusive data stewardship

Research domain: Public Participation & Research

Keywords: Data stewardship, Public trust

Foreword by Reema Patel, Digital Good Network

This landscape report examines the evolution of participatory and inclusive data stewardship, tracing developments since the Ada Lovelace Institute’s early thoughts about Rethinking data[1] and the publication of the conceptual framework and analysis in Participatory data stewardship.[2]

Rethinking data drew attention to the substantial power dynamics inherent in data collection, use and management, which shape societal outcomes. It reinforces the central proposition that data is not, and cannot be treated as if it is, neutral. When we talk about data, we are also talking about the sociotechnical structures around how it is gathered and stored, and about power – who has the ability to influence outcomes around data, with agency and voice. We are always in the process of constructing data, and our relationship with it is dynamic and often unequal. Choices society makes about the production and use of data reflect the distribution of power and are conditioned by power asymmetries.

Central in my own thinking as a researcher in the early days of the Ada Lovelace Institute was what became the framework for data stewardship, grounded in economist Elinor Ostrom’s vision of a common-pool resource. That vision itself centres on returning power to the people to whom the resource relates. This demands that we see data itself as a shared resource that requires careful, inclusive and collective management. Ostrom’s early design principles on the commons emphasised that genuine stewardship must be inclusive and participatory, involving people and society as a whole to support the common good.

Participatory data stewardship extended this thinking into a structured framework based on Arnstein’s ladder of citizen participation, advocating that participatory data practices should empower communities, shifting agency and control to citizens. Arnstein herself was a health policymaker based in the USA, and through that work she observed first hand the challenging dynamics of power and control between state, private actors and citizens.

Arnstein’s paper, ‘A Ladder of Citizen Participation’,[3] highlighted that the true value of participatory efforts lies in whether they successfully shift power structures and amplify marginalised people’s voices, enhancing their sense of agency and control. Inclusion and participation in her eyes are inextricably linked, with the most successful approaches ascribing a high level of agency and power to communities, and the least successful struggling to do so, or (even worse) reducing that agency or power – what she described as ‘tokenism’ or ‘manipulation’.

The Ada Lovelace Institute has since developed a programme of work exploring a positive vision for data, including a clear role for public participation. As innovation in the governance of data accelerates – especially in light of the central role of data governance in shaping AI and its outcomes – it is even more important that diverse communities and beneficiaries can influence data practices upstream, especially on issues where public opinion is unsettled.

This landscape review offers a timely reflection on the conditions that underpin participatory and inclusive data stewardship, as well as the developments – and the many obstacles – that characterise today’s landscape, especially given the post-pandemic surge in interest around participation in data and AI.

Looking ahead, the programme’s next publication will aim to provide practitioners and policymakers with actionable guidance and support for advancing inclusive and participatory data stewardship practices both in the UK and globally. It will also seek to offer renewed thinking on how best to assess the success, impact and effectiveness of their work as it evolves, recognising that this work rarely finishes. This reflects our commitment to building the field of participatory and inclusive data stewardship.

Reema Patel, Digital Good Network and Elgon Social Research
Principal Investigator, Participatory and Inclusive Data Stewardship

Executive summary

Data – that is, information about people and the world we live in – serves and supports many functions in society. Currently, data is foundational to initiatives that aim to improve people’s lives, from understanding the needs of local communities and collectives, to addressing significant societal and economic challenges, such as climate change or better health provision.

Governments, local authorities, institutions, private companies, civil society organisations, communities and groups of people therefore have aspirations to create, access, use and share data for different purposes. This means that they may all find themselves in a position where they are required to responsibly govern or ‘steward’ data.

There are considerable aspirations for data to support innovation, and economic and societal benefit. However, the current landscape is characterised by concentrations of data in silos and market dominance in a small number of multinational companies, and high-profile data-sharing errors and opaque private–public partnerships continue to undermine public trust in responsible data governance.

For example, the 2024 independent Sudlow review of the health data landscape makes the case for data to improve people’s lives – and the powerful insights generated by safely linking and analysing health data – but recognises significant structural obstacles and systemic delays that present barriers to maximising the benefits to society.[4]

This review, Participatory and inclusive data stewardship, builds on global scholarship and civil society analysis of practice, and specifically on three reports by the Ada Lovelace Institute (Ada) that interrogate the legal, structural and systemic preconditions required for data to deliver public benefit, and for people to make choices about their data.[5], [6], [7]

Ada’s working definition of data stewardship has been ‘the responsible use, collection and management of data in a participatory and rights-preserving way’.[8] The evidence provided in this review demonstrates the liveliness of debate around ‘data stewardship’ theory and practice. This encompasses multiple definitions and understandings of key terms – including data stewardship itself.

Participatory and inclusive data stewardship has two foundations: legal rights, responsibilities and established contractual and commercial law; and participatory practices and norms that are designed to increase participation and redistribution of power or agency. These can work together to achieve a range of stewardship purposes, including rebalancing power over data in the context of a data ecosystem that has become skewed towards large private-sector data holders.

This review also highlights where those two foundations can come into tension. On the one hand, governments are actively setting up new initiatives and partnerships that use the language of stewardship. In some examples, stewardship is primarily used to frame the stimulation and streamlining of data sharing, to foster innovation and mobilise data’s value for economic and societal benefit. At the same time, civil society and academic proponents are developing theory and practice around data stewardship’s capacities to support public benefit while rebalancing power towards data subjects and those affected by the use of data.

It provides an introduction to the utility of participation and inclusion in data stewardship, outlining the complexities of initiatives that require detailed knowledge of both legal mechanisms and participatory practices, and a snapshot of the landscape. As a landscape review, it preserves a neutral stance, while reporting on the perspectives of those working in this field on where and how the landscape has progressed over the years, and where there are still barriers and challenges that need to be addressed.

In particular, this review explores the role of inclusive practice in data stewardship and how this relates to participatory mechanisms. This is motivated by a recognition that – if data stewardship is to enable the potential for data to support societal, economic and environmental good – the distinct role of inclusion, in addition to participation, needs to be examined.

This review recognises that, for the debate to move forward, we need to understand where we are. To do this, it explores how data stewardship is being discussed and implemented, interrogating mechanisms that may facilitate participatory and/or inclusive practices, including the context of regulatory frameworks, and whether existing theory and practice is sufficient to meet the challenges of the current data ecosystem and the needs of data subjects and those affected by uses of data.

This review was produced as part of a joint programme of work with the Digital Good Network and the Liverpool City Region Civic Data Cooperative. The programme has been guided by the following research questions:

- What are the conditions across the ecosystem of participatory and inclusive data stewardship?
- Where are there distinctions across organisations working on participatory and inclusive data stewardship?
- How effective are various mechanisms for participatory and inclusive data stewardship?

The scope of this review focuses specifically on the first two research questions. The research sought to answer these questions through:

- Analysis of legal and participatory mechanisms that sets out the foundations of the landscape: the complex combination of existing legal provisions that ensure rights and protections and can underpin participation in data stewardship; and the practices that build out participatory and inclusive processes and outcomes.
- A desk-based database review of current practices, using keywords related to stewardship, that provides a broad view across the landscape of intellectual and practical thinking from those already investing in participatory data stewardship mechanisms.
- Interviews with experts in the field of inclusive and participatory data stewardship that describe from their experience opportunities and advances in theory and practice, and where greater clarity is needed in order for participatory and inclusive data stewardship to deliver its potential.

The question of effectiveness requires detailed analysis of the significant proportion of data stewardship activity in practice surfaced by the desk-based data review. We describe some representative and novel examples, and provide data relating to those.[9] Research into effectiveness will be undertaken through a subsequent Digital Good Network project exploring participatory and inclusive practices.

This landscape review found:

There is a distinct and emerging field of participatory and inclusive data stewardship, developing within and alongside other data stewardship models.
- The objectives and practices of those developing participatory and inclusive data stewardship are distinct from those of other data stewardship initiatives. They focus on collaborating with and empowering data subjects and those affected by collection, sharing and uses of data, often with a broader goal of supporting public benefit and a more equitable data ecosystem.
- Looking across sectors, participatory data stewardship practice is most mature in applications of data in the health sector, for example, voluntarily donated data initiatives or patient panels – a sector where there is already a high level of regulation and an established tradition of patient and public involvement and engagement (PPIE).
- Most examples of participatory and inclusive stewardship mechanisms identified were data trusts. This points to a preference for a mechanism with a clear legal underpinning, as well as to the case for understanding, demonstrating and socialising alternative legal and participatory mechanisms that might meet different stewardship purposes.
Participatory and inclusive data stewardship requires further time and investment, to test and potentially demonstrate its contribution to a healthy data ecosystem.
- Data stewardship itself does not have a fixed definition, and different understandings shape how it is used. For example, organisations that understand data stewardship as a responsibility to others (for example, data subjects, affected people or other beneficiaries) will have a different (and probably more participatory, and community- or public benefit-focused) approach than those that define their own standards for stewardship.
- Consistent practices and norms for participation and inclusion have not yet had time to develop: there are not yet mature models to provide sites for knowledge and expertise. Because proposed mechanisms are still largely theoretical or untested in practice over time, there are not yet transferable models or norms for the purposes, mechanisms, practices and outcomes of data stewardship.
- A holistic view of participatory and inclusive data stewardship mechanisms and practices is required, one that extends beyond data governance to include all the practical (sociotechnical) infrastructure around the data, including organisational functions or community behaviours.
- There is not yet enough evaluative evidence to assess whether aspirations for participatory and inclusive data stewardship mechanisms are realistic, to demonstrate factors to support the sustainability of these initiatives, or their success – and whether they would deliver value if adopted and integrated systematically into data infrastructure, processes and decision-making.
Legal and participatory mechanisms have different purposes and roles, and understanding and using both – and bridging between the two – has the potential to support building genuinely participatory and inclusive data stewardship.
- Legal mechanisms alone will not ensure that inclusive and participatory objectives (informing, consulting, involving, collaborating and empowering) are embedded in data stewardship initiatives. They play a critical role by underpinning these mechanisms, for example, mandating some aspects of informing and consulting for the public and private sectors, and enabling some aspects of empowering (like the right to data portability).
- Participatory mechanisms bring additional purposes and relational practices to data stewardship that support participation and inclusion, and enable inclusive and participatory objectives to be realised. They support rebalancing power towards data subjects, distributing power and value accrued from data to those affected by its use, and ensuring legitimacy and accountability to data subjects and affected people.
- Top-down and bottom-up data stewardship models have different purposes and practices. Top-down initiatives aim to build insights from data both for economic value and to maximise societal benefits. Bottom-up or community-led initiatives offer opportunities to challenge power asymmetries between decision-makers using data, and the people that the data represents and affects. These purposes use different methods, and lead to different outcomes. This means that just ‘doing’ participation isn’t sufficient – the way it is done matters.
- There is no single model for ‘good’ participatory and inclusive data stewardship, but a baseline would include: consideration and use of legal mechanisms to ensure rights and protections, as well as participatory mechanisms appropriate to the purpose and context, with responsibility and accountability to the requirements of data subjects, holders, beneficiaries and wider publics or affected people.
- Conversely, ‘bad’ data stewardship (or ‘data stewardship washing’) would be characterised by a lack of assumed responsibility or accountability in relation to the requirements of data subjects, holders, beneficiaries and wider publics or affected people. This would risk undermining the legitimacy and trustworthiness of the initiative, and broader public trust in data sharing.
Conditions in the current data ecosystem have not proactively supported the initiation or development of a wide range of participatory and inclusive data stewardship mechanisms, and infrastructure and power so far remain centred in large corporations and platforms.
- The bar for participation in data stewardship is currently high for data subjects and individuals wishing to contribute. For participatory and inclusive data stewardship to become part of the data ecosystem, it must be easier for people who want to contribute meaningfully to understand the goals, administration and benefits of different models, set up or join data stewardship initiatives, or to perform processes like moving their data.
- There is a high bar for setting up and sustaining data stewardship initiatives. Interviewees with experience setting up and studying data stewardship initiatives reported consistent barriers of capacity to support participation and inclusion, and funding for sustainability.
- Legislation and regulation do not explicitly mandate participatory and inclusive mechanisms. Current legislation supports participatory and inclusive data stewardship through protection of rights and freedoms for data subjects. Within the data ecosystem, greater incentivisation of participatory and inclusive approaches may be needed to rebalance power in the ecosystem, or institutionalisation of participation and inclusion in governance practices.
Within participatory and inclusive data stewardship, there are variations in theory and practice in relation to power, participation and inclusion that require further exploration:
- The relationship between participation and power is not linear (more participation does not necessarily equal more empowerment). For example, in data-pooling mechanisms, some organisations rely on voluntarily donated data but afford individuals little control over how that data is used, while other models enable dynamic consent of voluntarily donated data and more formalised involvement in governance.
- High levels of participation do not always equate to high levels of inclusion. Questions around who is empowered to participate, and how, are complicated by structural and systemic inequalities. Inclusion is distinct and requires consideration not just that people are involved in decision-making, but – with attention to the context and specificity of the situation – which people are involved, and how.
- Mechanisms for inclusive data stewardship are less developed than participatory mechanisms. This points to the need for more investment into the theory and practice of inclusive data stewardship mechanisms. For example, data stewards acting as trusted intermediaries could have a critical role in the data ecosystem, bridging the gap between participation and inclusion.
White-majority and English-speaking-centred sources and organisations are overrepresented within the literature we identified. This is partly due to methodological limitations, but points to a need to explore, understand and learn across existing approaches in other domains and jurisdictions.
- There is an indication that governance frameworks in white-majority countries that afford more power to individuals do not neatly translate across cultures and geographical areas, and particularly into countries where the majority of people are black, Asian, brown, dual-heritage or Indigenous to the global south. For example, structural inequalities and power asymmetries, such as those resulting from the legacies of colonialism, complicate how data initiatives translate.
- Some discourse assumes a comprehensive adoption of western definitions of data ‘ownership’ as property, which overlooks other cultural approaches to ownership, belonging and stewardship as knowledge relating to different peoples. For example, governance innovations in relation to Indigenous data sovereignty model different approaches to inclusion and cultural knowledge around data and data ownership, where perspectives of value go beyond monetary measures.

While this review does not provide specific recommendations to policymakers and practitioners working in this field, we hope this research will provide a foundation for future research and the development of practical mechanisms to further test the utility and efficacy of participatory and inclusive data stewardship in supporting a healthy data ecosystem that fosters societal benefit.

How to read this report

To understand and explore participatory and inclusive data stewardship and its potential to help shape a healthy data ecosystem, see:

- To understand what data stewardship is and the role it can play (Introduction: Definitions; What is data stewardship?)
- For an introduction to how participatory approaches can support emerging norms and practices that build on legal underpinnings to create more participatory and inclusive outcomes (Introduction: What is the role of participatory and inclusive practices?)
- To understand the current state of the landscape (Landscape review)
- To explore perspectives of people working in the landscape (Interviews: 1. Definitions and terms used in this field vary; 3. Participation takes many shapes and forms in the data lifecycle; 4. Mechanisms for inclusion are less developed than mechanisms for participation)
- To read a summary of our insights (Conclusion: What can we say about participatory and inclusive data stewardship?)

To understand different purposes and dynamics in the current data ecosystem, and how they shape approaches to participatory and inclusive data stewardship, see:

- A summary of structural or systemic issues and barriers (Introduction: What are the current conditions in the landscape?; Challenges for participatory and inclusive data stewardship)
- An exploration of different, complementary and competing incentives and purposes (Introduction: Different purposes for data stewardship)
- Examples of different purposes (Frameworks for participatory and inclusive data stewardship: Purposes)
- Insights from the landscape review (e.g. Landscape review: The relationship between participation and power is less linear than previously conceptualised)
- Insights from interviews (Interviews: 2. Purpose matters to comfort with different uses)

To understand how intersections of legal and participatory mechanisms can contribute to building a healthy data ecosystem, see:

- How legal rights and protections underpin existing participatory data stewardship mechanisms (Introduction: What is the role of legislation?; Analysis of legal and participatory mechanisms: Existing legal underpinnings for data stewardship)
- How different data governance mechanisms are more or less supportive of participation and inclusion (Emerging mechanisms for data stewardship: Table 1: Participatory mechanisms; Table 2: Participation-supportive mechanisms; Table 3: Non-participatory mechanisms)
- An analysis of existing legislation in the UK and EU (Existing legal underpinnings for data stewardship; Appendix: Legal underpinnings (EU))
- How existing mechanisms and frameworks build on legal underpinnings to support participatory and inclusive outcomes (Table 4: Mapping legal and participatory mechanisms to objectives for data stewardship; Frameworks for participatory and inclusive data stewardship: Purposes, Operational norms and Inclusion)

To understand objectives and activities of bottom-up practices that support participation and inclusion in data stewardship, see:

- Examples of participatory and inclusive data stewardship projects (Introduction: Data stewardship in practice)
- Analysis of academic, grey literature and practical examples of approaches that seek to collaborate with and empower data subjects and beneficiaries (Landscape review)
- Perspectives of academics and civil-society organisations developing and advocating for participatory and inclusive practices (Interviews)
- Legal analysis of how rights and protections underpin existing mechanisms, and where participatory practices build additional structures for inclusion and empowerment outcomes (Table 4: Mapping legal and participatory mechanisms to objectives for data stewardship; Frameworks for participatory and inclusive data stewardship: Purposes, Operational norms and Inclusion)

To explore objectives of emerging top-down, large-scale data sharing initiatives, see:

- A description of current ecosystem dynamics and potential tensions (Introduction: What are the current conditions in the landscape?; Challenges for participatory and inclusive data stewardship; Different purposes for data stewardship)

Next steps:

- To understand our overall insights (Executive Summary: Insights)
- To see recommendations for next steps for research and practice (Conclusion: Observations about the landscape)

Introduction

This landscape review aims to present a snapshot of current theory and practice in participatory and inclusive data stewardship, and to identify and analyse trends and developments. It takes a multidisciplinary sweep through relevant literature and thinking to explore how early examples of theory and practice intersect with motivations for data stewardship that supports individual and public benefit.

The review focuses primarily on the mechanisms and purposes of data stewardship, and who – or whose interests – participatory and inclusive data stewardship could or should serve. This includes how it is related to already available mechanisms and approaches, and how those might be developed or disrupted by current theory and practice.

While acknowledging the emerging nature of the field, it explores relationships between underlying legal assumptions (for example, of lawful, proportionate and necessary processing of identifiable data to ensure groups, rights and interests are represented and protected), and of participatory practices (for example, that if some groups of society are continually underrepresented in data stewardship practices, then power asymmetries in data will persist).

As well as participation, it has a specific focus on inclusion, understanding that these concepts need to be considered in relation to each other, rather than as discrete practices, because participation without inclusion is insufficient to ensure equitable outcomes.

Approach to the research

Considering data stewardship comprehensively (as a set of concepts, practices, expertise and mechanisms that are grounded in social realities) is complex. Our methodological approach has responded to this challenge, and has aimed to simultaneously envision, review and examine the developing ecosystem of participatory and inclusive data stewardship. At the same time, we have iterated and refined our methods, leading to different approaches that reflect and probe into multiple understandings of participatory and inclusive data stewardship.

Early research activities showed that mapping current data stewardship development requires exploration of practical implementations of the legal, structural and systemic preconditions that determine how data is valued, donated, collected, managed, controlled, accessed, owned and shared. They also showed that these concerns intersect with societal questions of belonging, consent, ownership, responsibility, citizenship, safety, privacy, transparency, individual and civic rights, individual and collective benefits, power and agency, equity and trust.

A question that informed our research design at an early stage was how and where to locate the expertise around participatory and inclusive data stewardship. The evidence shows that such expertise is found in practice-based community and civil society organisations and academia, crosses institutional and social domains, and is – by its nature – grassroots.

Through database keyword searches we identified where those activities were located in theory and practice, and which individuals and organisations were dominant in the literature. From there, we saw differences between the organisations we had identified in the keyword searches and those emerging as examples in the analysis. We used this knowledge to select interviewees who represented different aspects of this expert and informed landscape.

One early outcome was an understanding of the need to locate evidence about current practices in an analysis of preconditions or underlying factors in the wider data and AI ecosystem, and this informed the work on legal and participatory mechanisms. This led us to explore those differences and locate potential tensions between the two dominant areas of activity we identified – which we call ‘top-down’ and ‘bottom-up’. See also Methodology.

What are the current conditions in the landscape?

Data stewardship – the responsible use, collection and management of data in a participatory and rights-preserving way – is a young and developing field, in which different mechanisms are being used and trialled in a variety of models and pilots, for different purposes. There are many small-scale examples originating in academia, civil society and communities that demonstrate good practice and experimentation across a range of participatory mechanisms and practices.

Data stewardship in practice

This review does not go into detail about specific data-stewardship initiatives, but we acknowledge the importance of that evidence and of the developments taking place in participatory practices.[10] To provide a sense of the breadth of that work, we describe here examples of data stewardship mechanisms, selected by interviewees and through our research, that illustrate significant trends in the wider landscape:

The Native BioData Consortium (USA)[11] describes itself as the first nonprofit research institute led by Indigenous scientists and tribal members in the USA. It stores biological samples and data from tribal members local to their community; builds tribal capacity in science, technology, engineering and mathematics (STEM) research; and aims to build more robust health datasets to investigate health outcomes that benefit Indigenous people.

‘This is an Indigenous-led initiative that’s in part a biobank but also a data repository. They’ve been acting as a third party where they will hold data until a tribal nation works out what it’s going to do with it. There are examples where they have held data temporarily and then it has been deposited with the tribal nation.’
– Maui Hudson, Director, Te Kotahi Research Institute

Abalobi (South Africa)[12] is a South Africa-based civil society organisation with international reach that describes itself as a hybrid social enterprise with a vision to develop ‘thriving, equitable, resilient and sustainable’ small-scale fishing communities globally. It offers a suite of fisher-driven technologies to support data collection and application across the fishing supply chain, from electronic catch documentation and traceability through to area-specific marketplace information.

‘A good example of data stewardship that works well is Abalobi in South Africa. They built an app for small fishing communities in South Africa on the coast to track the whole process from the time the fish was caught all the way to when it’s sold. It’s surfaced the role of women in this whole cycle. Because the labour of women – usually as either sisters or wives or daughters – was just assumed, and [the app] surfaced their labour and allowed them to get remuneration for it. This was not data that the government was going to be able to collect at all easily, but it’s a big data gap that they [Abalobi and the fishers] were able to fill.’
– Vinay Narayan, Senior Manager, Aapti Institute

The Data Assembly (New York, USA)^[13] is an initiative from The GovLab to gather diverse and actionable input on data re-use for crisis response in the USA. The initiative began in New York City in summer 2020 in response to the COVID-19 pandemic. It consisted of remote deliberations with three publics: data holders and policymakers; representatives of civic rights and advocacy organisations; and New York residents and visitors.

‘Something we did here in New York City during COVID-19 was to work on accessing data reuse to inform pandemic response and preparedness. The Data Assembly was the first-ever citizens’ assembly around the reuse of data, to inform the City government on what New Yorkers – broadly defined as residents and visitors to New York – felt was appropriate and under what conditions data could be reused. The key element here was who was holding the assemblies? We had public libraries hosting, which was an important design feature.’
– Stefaan Verhulst, Co-Founder, The GovLab (New York) and The Data Tank (Brussels)

Saidot (the Netherlands and Finland)^[14] produces open AI and algorithmic registers that support informing and empowering people by enabling an ‘armchair audit’ of AI systems in Helsinki and Amsterdam. In our 2021 Participatory data stewardship report,^[15] we observed that Helsinki and Amsterdam were among the first cities to announce open AI and algorithm registers. These were founded on the premise that the use of AI in public services should adhere to the same principles of transparency, security and openness as other city activities, such as public bodies’ approaches to procurement.

These registers are openly available, and accessing the register reveals information about the systems that are reported on, the information that is provided, and the specific applications and contexts in which algorithmic systems and AI are being used. Saidot conducted research and interviews with clients and stakeholders to develop a model that serves the wider public, meaning it is accessible not only to tech experts but also to those who know less about technology, or are less interested in it. Saidot has demonstrated longevity as a model: as of 2024 the registers remain in use in Amsterdam and Helsinki.

Our Future Health (UK)^[16] is a collaboration between public- and private-sector organisations and charities that has developed a trusted research environment (TRE) for health data. It aims to be the UK’s largest-ever health research programme, supporting wellbeing and longevity through better prevention, detection and treatment of disease. Our Future Health assures participants that their data will only be used in the context of health and disease, and that they can withdraw consent at any time. The programme has involved members of the public through focus groups, interviews and a public advisory board.

Safetipin (India)^[17] describes itself as a social impact organisation working towards building responsive, inclusive, safe and equitable urban systems with a particular focus on women’s safety. It collaborates with government and non-government stakeholders to use big data to improve infrastructure and services in cities. Data is collected via a mobile phone app in multiple ways. Some of the data is crowdsourced, for example, through a mobile app on a car windshield taking photos, or through user feedback on safe/unsafe areas. Other data is collected on a project basis, for example, by trained Safetipin associates collecting data about the accessibility of bus stops.

‘Safetipin is an organisation that maps the safety of streets, and they’ve worked with local school children – particularly girls – who don’t have access to mobile phones, to map street lights and any particular things that make them feel nervous, or things that feel dangerous on their streets on their walk home. So they’ve done ethnographic walks with participants, noting where they feel nervous.’
– Joe Massey, Senior Researcher, Open Data Institute

Aya, Cohere for AI (international)^[18] is a global open science project to create new multilingual models and datasets that expand the number of languages covered by AI, and the quality of the language data. It is one of the largest open science machine learning projects to date, involving over 3,000 independent researchers across 119 countries and 101 languages. The open-source dataset is generated through working with fluent speakers of languages from around the world to collect human-annotated instruction data in relation to specific languages for AI research.

‘Data contribution is not necessarily very participatory, but it is an activity that enables a lot of people to get involved. Cohere for AI’s Aya initiative has built out multilingual datasets that otherwise would not have existed. Many times these organisations have joined primarily for another purpose, whether that’s digital preservation of a language, or wanting the dataset for their own purposes to build language models or technology that’s purpose-built for their languages. It’s been an interesting opportunity for different groups to engage with specific communities to help achieve mutual goals.’
– Jennifer Ding, Senior Researcher, The Alan Turing Institute

The Data Trusts Initiative (UK)^[19] supports pilot projects that bring together researchers and social entrepreneurs to explore how to move models for data trusts from theory to practice. Its aim is to empower individuals and communities, while supporting data use for social benefit. Current pilots include the Brixham Data Trust, which explores facilitation of environmental stewardship, health, wellbeing and net-zero ambitions in the context of a small fishing town. The Born in Scotland Data Trust seeks to tackle the economic and healthcare inequalities affecting communities in Scotland by building an infrastructure for trustworthy data stewardship around a pilot birth cohort study. The General Practice Data Trust is investigating how data trusts could help steward healthcare data in the UK. In particular it focuses on the million people who have opted out of the NHS General Practice Data for Planning and Research (GPDPR) programme due to concerns about how their data will be managed and used, and aims to give them an alternative opportunity to participate in life-saving research.

Driver’s Seat Cooperative (USA)^[20] launched in 2019 to empower gig workers – particularly in the transport sector, for example cab and delivery drivers – with tools to advocate for better pay and to combat the arbitrary use of algorithmic management systems that put them at an unfair disadvantage and undermined their rights. The Driver’s Seat app allowed workers to share and analyse insights from their data, providing transparency about payment, distance and time, to counter discriminatory or unfair remuneration practices. The objective was to build a cooperatively owned and directed business, creating worker-initiated data not only to empower workers but also to inform workplace and transportation policymaking.

The Driver’s Seat app had 20,000 downloads and successfully supported gig workers to represent their rights and campaign for more equitable working conditions, particularly in the San Francisco area. Despite this, it was not able to sustain a profitable business model. However, the cooperative and democratic decision-making model persisted, and members were invited to vote on the closing plans.^[21]

In 2024 the app and tools transitioned to become part of the Workers’ Algorithm Observatory, a crowdsourced auditing collaboration at Princeton University.^[22]

‘In 2021 we did a landscape mapping to identify a bunch of data stewardship initiatives. Last year they went back to see where those organisations are, and I think the biggest thing is that most of these organisations don’t scale too well. Funding is a big challenge – most of them rely on philanthropic fundraising or grants, because there isn’t much monetary value in these models. A cooperative where you charge your members a fee might work better… A lot of models fail to move past the pilot stage, and even when they do, funding continues to remain a concern. Driver’s Seat is a prime example: they’ve been around since 2019 but had to close shop last year due to a lack of funds.’
– Vinay Narayan, Senior Manager, Aapti Institute

Challenges for participatory and inclusive data stewardship

All these activities are taking place in a landscape characterised by imbalances of power, concerns about the safety of new technologies, national and global regulation lagging behind the development and deployment of new products and services, and low trust in data sharing.^[23],^[24] In addition, there are concentrations of data in silos and market dominance by a small number of multinational companies, high-profile data-sharing errors and opaque private–public partnerships that continue to undermine public trust in responsible data governance.^[25]

While we do not suggest that participatory and inclusive data stewardship alone can remove the inequities in the current data economy^[26] – that requires a fundamental restructuring of the institutions and distributions of economic power^[27] – we do propose that it can make a contribution towards disrupting private-sector monopolisation, rebalancing power towards data subjects and those affected by data sharing and use, and supporting collective or public-interest outcomes.

In relation to data, it is understood that ‘in the current data ecosystem, the most valuable data is generated about individuals without their knowledge or control’.^[28] This is because private companies hold disproportionate power in the data ecosystem, often assuming the role of data intermediary through creating and harvesting data as part of their business models. This is particularly true in relation to the African continent, where there are high aspirations for data sharing to develop research and policy in relation to ‘deficit narratives’ that highlight only systemic poverty or inequality. These conversations are commonly pursued by non-African stakeholders, while datasets are extracted from African communities.^[29]

Making space for participatory and inclusive data stewardship would mean ‘disintermediating’ the existing system. This would involve disrupting the ways that companies, including monopolistic platforms, routinely insert themselves into technical, social and economic interactions to create or extract data and control access. Often they do this without creating value for data subjects,^[30] or distributing value through different actors in a supply chain – including, for example, small and medium-sized enterprises (SMEs).^[31]

This hold on power is reflected in companies’ internal practices, and in the limited extent to which these companies engage with or attend to participatory mechanisms.^[32] There is evidence that this tendency towards a technocratic approach is also reflected in the public sector, where ‘data stewards’ take responsibility for communicating findings to diverse audiences, such as other public servants or communities, but assume that people with non-technical backgrounds may struggle to understand complexities.^[33]

The lack of established models for participation and inclusion presents another challenge. For those wishing to set up participatory data stewardship initiatives, it is necessary to understand and be able to mobilise a range of knowledge and skills. Existing frameworks do not yet capture communicable and transferable norms. An exception is the Aapti Institute playbook, which is presented through challenges experienced on the ground, and potential strategies or solutions in relation to ‘sector-specific nuances, requirements and realities’.^[34]

There are also longer-term examples from the health sector, where patient and public involvement and engagement (PPIE) practices do have established norms.^[35] Recent research in the health sector provides learnings for new participatory mechanisms like data trusts; for example, the process of involving people in the governance of large-scale biobanks. This highlights the need to attend to who is involved and their interests, who speaks on behalf of whom, and distinctions between – and the existence of – multiple publics.^[36]

However, a study of public attitudes, hopes and concerns in relation to the use and sharing of NHS data found that – despite these more established norms – the public want more transparency, accountability, public participation and fairer distribution of benefits.^[37] Dynamic (rather than ‘informed’) consent, which enables data subjects to maintain control over time in relation to how their data is (re)used, for which purposes and by which actors, is well researched in relation to biobanks but not yet an embedded practice in other contexts.^[38]

The rise in interest in generative AI technologies brings novel challenges for participatory data stewardship, including the increasing prominence of debate around the safety and governance of AI, algorithmic and machine-learning technologies. While this focus has overshadowed recent public discourse around data, it also highlights the essential role of data and datasets in the development and deployment of AI technologies, in model training and processes like prediction.

The pervasiveness of AI and the distributed nature of the supply chain, which can involve multiple companies as developers and deployers of AI, or holders and users of data, require new models for meaningful stewardship in practice at different decision-making points.^[39],^[40] There is optimism that participation can overcome the perception that AI – even as it develops into increasingly human-interaction spaces – is disconnected from social governance,^[41] and that participatory practices can turn the current appropriation of scale by multinational corporations into infrastructure and resources that distribute power and enable localised communities to represent their own interests.^[42]

In the absence of participation-focused regulation, AI companies are making their own decisions in relation to appropriate protections for creators and holders of data,^[43] and – with some notable exceptions, such as those building open-source datasets – show little evident concern for empowerment through meaningful participation.^[44] Many AI companies have been training their models on publicly available (if not always openly licensed) data, and are now looking to increase the performance of their models and market privilege by buying access to private datasets.^[45]

One example of a resource that is at risk of being undermined is the New Zealand-based Te Hiku database and transcription software, developed by and for the Indigenous community to preserve and share intergenerational recordings and knowledge about the te reo Māori language. Managed as a pooled resource, the collective has protected its irreplaceable dataset through a hosting and licensing arrangement, but recognises its vulnerability to AI model training and is calling for revisions to intellectual rights and copyright law.^[46]

More recently, there has been a focus on building large-scale, public data infrastructure through a burgeoning of public interest AI discussions.^[47] In practice, this means that public institutions, service-providers and private companies – which have started to amass considerable power and/or responsibility through holding or using data – wish to build insights from data both for economic value and to maximise societal benefits.

With this comes a responsibility to balance the potential of economic and societal benefits through:

Mechanisms that will enable those in stewarding roles to ensure protections of fundamental rights and freedoms encoded in law – such as privacy, data protection and safety.
Mechanisms that will enable inclusive consideration of people whose data is included in these large datasets (albeit anonymously and beyond the reach of data protection legislation), and who may be affected by decisions based on large-scale analysis and insights.

In these large-scale data-sharing initiatives that steward smart or non-personal data, there is a narrow definition of stewardship emerging that removes the sense of (fiduciary) responsibility. Instead, they use the language of working towards an abstract notion of good or public benefit that is not legitimised by a relationship with the needs or interests of data subjects or beneficiaries. In addition, these initiatives tend to use mechanisms at the low-participation end of the spectrum: they meet legal requirements but do not use participatory mechanisms. The result is a risk that unparticipatory and uninclusive data stewardship emerges, which could be called ‘data stewardship washing’.

Definitions

Data stewardship is itself a contested term, even without the layering of participation and inclusivity. In addition, legal and normative definitions are used to refer to different actors in the data stewardship supply and value chains. This review uses the following terms:

Data: refers to information about people, activities and the physical world. Data cannot be neutral: when we consider data, we must also consider the sociotechnical structures around its collection and use. An important distinction in relation to how it is stewarded is whether it is subject to specific legal requirements and processing as personal data (‘related to an identified or identifiable person’ as defined by the General Data Protection Regulation (GDPR)) or is subject to less scrutiny as non-personal data (such as anonymised data, sensor data of traffic flows or Internet of Things).

Data intermediary: any mechanism or service that seeks to manage and intermediate relationships responsibly between individual data subjects, data holders and data users (see definitions below). Note: while many corporate platforms are currently technically de facto ‘data intermediaries’, our definition of data intermediary assumes a purpose beyond direct commercial value from data sharing, which benefits data subjects, holders, affected people and wider society, as well as those who make use of the data.

Data steward: the independent person, organisation or institution that creates a trusted environment which supports decisions about access and sharing of data subjects’ and data holders’ data. This will include supporting them to exercise their rights, make informed choices, consult and exchange views on data processing purposes, and negotiate terms and conditions for data processing on their behalf. Incentives for data stewards vary according to the purpose, terms and objectives set.

Data subject: the individual holding rights over their personal data, for example the right to access, erasure, rectification, portability.

Data holder: the person, organisation or institution who has obtained the right to grant access and share data to which the data subject(s) has / have rights.

Data management service provider: the organisation providing the service through which an individual’s decisions about their data are operationalised. The service provider makes decisions over the scope and design of the technical architecture of decision flows, and is generally not involved directly in the individual’s decision-making process. These organisations do not take an active stewardship role but can influence positively or negatively through the interface design (for example, the platform could nudge individuals to consent to sharing data for certain causes that might involve public benefit).

Data user: the person, organisation or institution receiving access to data and using data to which the data subject has rights.

Affected people: anyone who might be affected positively or negatively by data-processing practices.

‘Bottom-up’ data stewardship: any mechanism where decision-making power is held by the data subjects, and entry and interaction into relationships within the existing data ecosystem requires legal or administrative arrangements to empower data subjects to take a more active role in stewarding data about themselves, by granting access under specific conditions or protecting sensitive data.^[48] These stewardship arrangements would 1) codify consultation and participation in the foundational governance documentation and 2) enact the consultative and participatory mechanisms set out in the governance framework.

‘Top-down’ data stewardship: any mechanism where decision-making power is located with the data holder (such as platforms, Government, local government or health and welfare infrastructure), and equitable activity in the existing data ecosystem requires mechanisms to facilitate the participation of data subjects and / or affected people (as part of, or in addition to, a legal or administrative structure). These organisations might be combining or linking data from multiple sources, or acting as a gatekeeper for data held by other organisations.^[49] Other descriptors used are private or civic data trusts.^[50]

Participation: the involvement of people in meaningfully influencing and shaping or effectively making decisions that affect their own lives and wider societal outcomes. Meaningful participation in data stewardship will reflect the motivations and aspirations of those participating, so there is no single model – in some situations, people will want others to take responsibility for, and carry out, work required to ensure their views are heard and taken into account, and in others people will want to play an active role. Therefore, it is necessary to frame participation in terms of a range of outcomes: informing, consulting, involving, collaborating or empowering people.

Inclusion: the ability of diverse groups of people and individuals to meaningfully participate in the stewardship of their data, with a secondary focus on how this participation brings individual or wider community or societal impacts. Inclusion is relevant to both the process and outcomes of data stewardship: excluding individuals and groups of people from the design of mechanisms prevents their participation in decision-making, and excluding them from lawful, necessary and proportionate data collection prevents their representation in initiatives built on those datasets – undermining public benefit. Inclusion requires a specific focus on equity and rebalancing power.

Note: our report Participatory data stewardship used the collective term ‘beneficiaries’ to refer to a number of different actors who might be affected by the use of data and might have the potential to benefit from a participatory data stewardship approach. The intention was to move beyond a ‘compliance-based approach’ to one underpinned by social licence. Our current thinking is that distinguishing data subjects and affected people in the context of specific datasets enables a clearer understanding of relevant differential legal, technical and organisational factors operating in the current data ecosystem in relation to data stewardship. Legal protections are grounded in the identification of individual rights holders and affected people in specific contexts, and beneficiaries is used to indicate those – including individual rights holders and affected people – who stand to benefit in any way from specific, normative, participatory mechanisms.

What is data stewardship?

Data stewardship is a relatively new concept: it has been a recognised mechanism for the responsible and trustworthy management of data among national and local policymakers, civil society and communities for less than a decade. How stewardship is defined in relation to data-sharing initiatives matters, because it determines how responsibility, legitimacy and accountability are conceptualised and operationalised.

Stewardship encompasses practices that are rooted in legal protections, such as data protection law that ensures a blanket level of protection for data processing practices (the section below also discusses participatory elements in data protection law), or trust law that can serve as basis for new data governance models such as ‘data trusts’ with embedded fiduciary obligations. Stewardship also encompasses social and organisational norms that extend beyond the management of data into ways of working and standards of practice.

In the UK, the push towards seeing data intermediaries (for example, data trusts) as a realistic mechanism to facilitate data sharing for economic growth, encourage competition and develop a trusted and safe way to share data, dates to around 2016. Since then, global initiatives have developed that bring prospects of multiple new institutions and mechanisms for involving and empowering people through data stewardship, shifting the focus from who owns the data to who takes responsibility for its use, and on whose behalf.

As this review demonstrates, there continues to be an established and live discourse around ‘data stewardship’ theory and practice. This encompasses multiple definitions and understandings of key terms – including data stewardship itself: data stewardship can be understood as both an institutional approach, and also a set of functions and competencies. The Ada Lovelace Institute’s (Ada’s) working definition of data stewardship has been ‘the responsible use, collection and management of data in a participatory and rights-preserving way.’^[51]

This implies that data stewardship should be understood as embodying a principle of responsibility – not just to the interests of one’s own self or organisation, but to the interests of others. This relationship can be ‘fiduciary’, denoting a specific legal responsibility with – for example, in the case of data trusts – obligations to ensure trustees act in the best interests of those creating the trust.

The relationship between stewardship and governance is similarly contested: some see stewardship as a subset of data governance, some emphasise stewardship’s focus on a societal goal as opposed to ‘governance’s focus on processes for making decisions and exercising power’,^[52] while others see stewardship encompassing data governance as ‘the process by which responsibilities of stewardship are conceptualized and carried out’.^[53]

Using the term ‘data stewardship’ can indicate an approach to data management that builds on governance – the structural conditions that enable data to be used in trustworthy and inclusive ways – towards an approach that broadens compliance into consideration of rights, responsibilities, common good and mutual or public benefit.^[54]

This conception of stewardship has its roots in the work of economist Elinor Ostrom, who used stewardship to identify harmful and beneficial practices in the governance of natural resources.^[55] Data is not a traditional resource – and there are some properties of data, such as its ability to be shared and re-used, that make it distinct from natural resources. However, stewardship has become established as a useful positioning concept to enable consideration of how data is collected, used and governed in systems of power inequalities, for what purposes and in whose interests.

These definitions are culturally conditioned, and there remain substantial differences between knowledge and conceptions of Global North and Global South countries, policymakers, companies and researchers – as well as a tendency to present dominant (Global North) narratives as implicit norms. Continuing ‘data colonialism’ freights many assumptions of unjust historic actions, perpetuating the deficit model by overlooking plural, community-centred data practices, values and traditions, as well as scientific and cultural developments within the 54 African nations.^[56]

Accepting the need for a ‘decolonial approach to data’ that broadens the range of possible models and solutions beyond western-centric assumptions,^[57] there are still inconsistencies in the definition, responsibilities, goals and relationships implied by data stewardship. Some see stewardship embodied in distinct organisations and processes, and others see it as an active, negotiated and relational dynamic between interested parties. Data stewardship can therefore be both an institution and a set of practices.^[58]

Data can be seen to have different relationships to societal norms of ownership and property, and many initiatives seek to maximise the collective, societal value of data as a ‘commons’, rather than its individual, property-based or monetisable value. Notions of data ownership, which stem from equating data governance straightforwardly with property rights^[59] rather than human rights law,^[60] differ depending on relationships to empowerment and cultural geography. Some arguments recommend strengthening individual control, while others recommend increasing collective control and increasing the prominence of public value or benefit.^[61]

The mechanisms for stewardship are also unsettled and evolving, despite specific work to review and typologise intermediaries that aims to add to the knowledge base and identify emerging types, to contribute to common terminologies.^[62] The different views of data stewardship in practice reflect the positionality of interested organisations, their place in the context of current debates and the purposes they propose. The UK Centre for Data Ethics and Innovation (now the Responsible Technology Adoption Unit) defined data intermediaries in 2021 as ‘a broad term that covers a range of different activities and governance models for organisations that facilitate greater access to or sharing of data’.^[63]

In 2021 the GovLab called for a reimagination of data stewardship to encompass ‘functions and competencies to enable access to and re-use of data for public benefit in a systematic, sustainable, and responsible way.’^[64] In 2022 the Open Data Institute identified the need to disambiguate data institutions and intermediaries, while acknowledging the similarities (both terms recognise organisations that empower individuals as well as organisation-to-organisation approaches; and both are mechanism-agnostic). Intermediaries may facilitate data-sharing, and institutions take on a broader set of functions, including stewardship and the responsibility to steward data on behalf of a particular sector or community.^[65]

In considering setting up a central government data ethics body, the Royal Society focused on the responsibility embodied in a stewardship relationship, describing it as a body mandated to ensure responsible use of data.^[66] In the Mozilla Foundation’s introduction to concepts, it took a more power-differential view, describing stewardship as the act of empowering agents in relation to their own data.^[67] The Open Data Institute described it as embodying a ‘fiduciary relationship’ of trust between data stewards and data beneficiaries^[68] – it has now updated this definition to ‘the collection, maintenance and sharing of data’.^[69] And the Aapti Institute described a process designed to empower individuals and communities while protecting rights: ‘a paradigm which explores how the societal value of data can be unlocked while considering what it takes to empower individuals/communities to better negotiate on their data rights’.^[70]

Definitions of data stewardship centre on the positionality of the power-holder, with an assumption towards governance models that do not involve state or market actors. Early conceptualisations involved a tacit assumption that the trusted relationship implies a ‘fiduciary’ duty in which the interests of individuals or groups of people are protected and cared for by a responsible person or body: ‘The concept of a data steward is intended to convey a fiduciary (or trust) level of responsibility toward the data.’^[71] This model carries assumptions about who might assume that trusted role – usually an organisation or professional operating under a legal responsibility.^[72]

Although there is no fixed definition of ‘data stewardship’, it is intimately tied to existing legal protections, such as privacy, security and human rights. However, these legal bases for governance have been supplemented by normative activities, behaviours, approaches or frameworks. These are frequently found in academic, civil society and community models, and are specifically designed to foster and enable the creation of participatory and inclusive data stewardship models and practices, which increase agency and support empowerment of people.

In practice, data stewardship cannot be structured effectively through legal or participatory means alone: it is the combination of legal and participatory mechanisms and practices that produces the normative rules and behaviours that structure specific data stewardship mechanisms. These norms instantiate the equity and inclusiveness of a particular stewardship mechanism and the agency it provides, while also providing the relational aspect of reliable governance mechanisms that protect the rights and interests of those who currently hold less power in the data economy (for example, data subjects and civil society organisations who represent their interests, small companies and local public authorities).

Emerging mechanisms for participatory and inclusive data stewardship

To understand the current ecosystem of participatory and inclusive data stewardship, it is necessary to understand their legal governance mechanisms, relative to power-holding and sharing, and how these support more-or-less participatory processes and outcomes (whether participants are informed, consulted, involved, collaborated with or empowered).

The three tables below illustrate the range of data governance mechanisms currently in use and their degree of participation.

Table 1 shows mechanisms that are designed towards participation.
Table 2 distinguishes ‘participatory supporting’ mechanisms, meaning mechanisms that allow and can be optimised for increased participation if participants wish to have higher levels of engagement.
Table 3 lists non-participatory mechanisms, which nevertheless represent options for more choice over data management or increased data sharing.

Figure 1: Overview of data governance mechanisms

The following definitions for data trusts, data cooperatives, personal information management systems (PIMS) through to data-sharing pools are based on Ada’s work,^[73] and the European Commission’s Joint Research Centre’s map of data intermediaries.^[74] The remaining definitions build on the Centre for Data Ethics and Innovation (now Responsible Technology Adoption Unit) independent report on data intermediaries^[75] and other relevant scholarship.^[76] ^[77] ^[78] ^[79]

Note that there is not currently a mapping of participatory mechanisms to mechanisms of inclusion, but that participatory mechanisms are not inherently inclusive, and can create conditions in which groups or communities feel excluded or disconnected from each other, as well as creating communities in which people feel included.^[80]

Table 1: Participatory data governance mechanisms

Data cooperatives, data unions and data commons mechanisms are all intrinsically participatory mechanisms, working with or initiated by data subjects and designed to involve them collectively in decision-making about the holding and use of their data. These mechanisms tend to be aligned with outcomes relating to collective or public benefit.

Data governance type	Definition	Governance mechanisms	Participatory mechanisms
Data cooperatives	A cooperative is an established legal mechanism, typically formed where there are collective interests that are better pursued jointly rather than individually. Cooperatives can take many different legal forms depending on the jurisdiction: for example, under UK law, there is no specific definition of a cooperative. The most widely used form is a cooperative society, but they can also be set up as a private company limited by shares or a private company limited by guarantee. The legal form will depend on the level of liability members are willing to expose themselves to, and the way members want the cooperative to be governed. Data cooperatives are emerging intermediary structures where data is stewarded for the benefit of its members as individual data subjects (who might also see their interests as benefitting wider society).	Data cooperatives are a decentralised form of data intermediary owned and run by members. Membership is open and voluntary, members hold an equal number of votes, and the benefit is shared among them. It is an autonomous and independent form of governance where data is stewarded in the collective interest of its members, and depending on the chosen governance form, it can advance the interests of all members at once, and/or it might be chartered to achieve consensus over whether an action is allowed.	Data cooperatives are highly participatory, allowing individual data subjects to have a direct say in how data is used and governed, steering its use according to their motivations, preferences and concerns. If a new ecosystem of data cooperatives were created, this would create differentiation and choice for data subjects.
Data unions	A data union is a proposed new mechanism for data stewardship: a type of intermediary that positions itself between its members and the data platforms on which they perform economic activity (for example, Uber, Airbnb and similar entities). Data subjects grant the union (the intermediary) the exclusive right to use the data generated through the platforms. In a data union regulatory structure, platforms would not be entitled to access user data as it would be deposited in a union. To access data, platforms would have to negotiate terms of access with the union. Note that data unions are emerging, for example, Workers Info Exchange, but because of adverse conditions in the ecosystem they are not yet able to reach maturity as a collective representation model.	A union is a formalised governance framework organising people (normally workers) for collective bargaining and representation. Data unions transfer the democratically governed labour union model into a data context. Unions could have regularly elected leadership that implements the policy goals of the union. They could be organised by geography or by policy, and funded through fees for access to data. Note: the model presented here is distinct from the model of a formalised, decentralised governance framework that enables individuals to join together, control and potentially monetise their data – for example, people can contribute driving data to a data union, to be sold to insurance or mapping services.	Data unions are highly participatory. Participation can take the form of collective gatherings for voicing opinions and setting terms, and representatives can be delegated to negotiate and act on members’ behalf. Data unions can allow communities, for example currently marginalised communities, to decide their own data policies.
Data commons	A data commons is an emerging mechanism: a collective set of resources that are governed based on economist Elinor Ostrom’s principles. A data commons is a community that collectively and sustainably governs data. This may be motivated by sharing data between organisations (such as researchers) to collectively solve problems, or may go beyond problem-solving into sharing languages and cultural understanding (such as Indigenous communities).^[81] A data commons implies that the body of (data) resources would grow or decline independently from the number of stakeholders and their individual data. This model assumes the commons will support community or societal benefits.	Data commons are informal structures, with organisation through social norms and institutional arrangements. Data commons can have different governance arrangements, which negotiate individual and collective entitlements over data as resources that are placed under common control.^[82]	Data commons can be highly participatory. The degree of participation in a data commons can be flexible, with members of the community getting involved as much or as little as they wish and have capacity to.

Table 2: Participation-supporting data governance mechanisms

Data trusts and data-sharing pools can be more or less participatory, depending on specific governance choices: some trusts or pools will choose to empower data subjects to participate collectively in all aspects of governance and decision-making, while others will establish rules, norms or ways of working collaboratively at the outset and not require further substantial input. Their outcomes and objectives can be more or less aligned with collective or public benefit.

Data governance type	Definition	Governance mechanism	Participatory mechanisms
Data trusts	A trust is an established legal mechanism under common law where assets (including data rights) can be placed under the control of a trustee who manages these assets on behalf and for the benefit of its beneficiaries/parties creating the trust. Data trusts are an emerging mechanism, where the exercise of data rights is the asset placed in trust by data subjects or data holders.	Data trusts allow individual data subjects or data holders to pool data rights and proactively determine the terms of use in accordance with their objectives and intentions (including to support wider public interest) as a way to rebalance some of the power asymmetries in today’s digital environment. Central to its governance are fiduciary obligations to ensure trustees act in the best interests of the parties creating the trust. It may borrow elements of equity law for creating an accountability framework.	Data trusts can be highly participatory, requiring systematic input from data subjects / holders, or can delegate responsibility to the trustee to determine what is beneficial for their interests. The trust’s founding charter may include mechanisms for deliberation and consultation with its beneficiaries / parties creating the trust. Some research distinguishes public data trusts. In public data trusts, public bodies take on the role of trustees to steward citizens’ data. They may also take on a specific role in informing policymaking, public-service provision and innovation.^[83]
Data-sharing pools	Data-sharing pools are alliances among data subjects and holders that share data with the aim of improving their assets (data products, processes and services). These alliances of organisations develop around a shared purpose, context or application, and are intended to benefit all participants. Examples include the Emergent Alliance initiative which aimed to aid societal recovery post COVID-19.	Data-sharing pools are a decentralised form of data intermediary that operate under contractual law. There is no single model: specific technical, legal and organisational structures are developed to support cooperation and coordination towards the agreed purpose(s) of the pool. Specific governance mechanisms for access to and uses of data held in the pool are defined through contractual means. Competition law considerations might be built into the data sharing agreements to protect against anti-competitive behaviour.	Data-sharing pools can be highly participatory, building in participatory governance mechanisms in which members exercise an equal stake in the organisation and its management, or can delegate responsibility for decision-making. Like data cooperatives, if a new ecosystem of data-sharing pools were created, this would create differentiation and choice for data subjects.

Table 3: Non-participatory data governance mechanisms

These mechanisms are inherently neither participatory nor inclusive, but are included here to present a full picture of the possible mechanisms for data stewardship. Personal information management systems (PIMS), data marketplaces, exchanges and industrial data platforms provide mechanisms for controlling or exchanging data that do not enable collective decision-making or require consideration of public benefit outcomes. PIMS are designed to support individuals to better manage and control their data, while marketplaces, exchanges and industrial platforms are designed to enable frictionless data sharing between organisations, normally without consideration or involvement of data subjects.

Data governance type	Definition	Governance mechanism	Participatory mechanisms
Personal information management systems (PIMS)	PIMS are an emerging data governance mechanism, through data management service providers: technologies developed to offer data subjects a means to leverage control of the processing of their data. Their aim is to provide an alternative approach to data processing by increasing the possibility for individuals to exercise their data rights and manage decisions over their data (for example, consent management decisions over data access and specific uses, data portability).	PIMS usually take the form of an application or a dashboard with no direct governance participation from individual data subjects.	Individuals are empowered to make more granular decisions over their data and have a say over who has access, when and under what conditions. This is a form of exercising agency and autonomy over data expressed primarily through technical controls. At the technical platform level there is usually no opportunity for participation in how the technical tools are developed, aside from potential co-design and user feedback consultations at the developers’ discretion.
Non-participatory mechanisms for data-sharing	Data marketplaces: platforms that offer intermediation as part of a value-add service that matches the supply and demand of data or data products and services. These platforms act as ‘neutral intermediaries’ in data flows as (i) they do not actively intervene in data value chains but solely facilitate the matching of supply and demand, and (ii) the data intermediation service is open to any third party that respects the terms and conditions of the intermediary and the legal framework. Data exchanges: operate as online data platforms where datasets can be advertised and accessed – commercially or on a not-for-profit basis. Industrial data platforms: provide shared infrastructure to facilitate secure data sharing and analysis between companies.	Technology platforms that exchange data under the terms and conditions of the platform.	These mechanisms are included in the taxonomy to demonstrate that there are mechanisms that enable an increasingly frictionless transfer of data without consideration for participation of data subjects, basing their right to use or process data on legal justifications like consent and legitimate interest.

Different purposes for data stewardship

As illustrated, data stewardship can take different forms, and can be more-or-less participatory or inclusive. Which mechanism is right for each context depends on the purpose, which defines the governance principles or business model, and how the mechanism seeks to operate in a broader context of ecosystem and sectoral incentive structures and interests. We have identified a variety of purposes of data stewardship within the context of the existing data ecosystem – from disrupting private-sector monopolisation to establishing or profiting from commercial relationships, to realising individual objectives of participants, rebalancing power towards data subjects, and supporting collective or public-interest outcomes.

The Data Economy Lab 2019 taxonomy identifies three broad purposes: creation of societal value, generation of commercial value and empowerment of individuals.^[84] The Open Data Institute has put forward six purposes: to empower individuals and communities; to incorporate diverse perspectives and experiences; to enhance data quality; to increase people’s control over data; to enable more effective (local) problem-solving; and to build trust and collaboration.^[85] Other scholars identify four functions: developing collective forms of data governance; protecting vulnerable populations from abuse; providing mechanisms to rebalance the powers of large platforms; and unlocking new markets for data use.^[86]

The examples demonstrate yet further purposes: to ensure data is accurate and represents marginalised communities; to develop and improve intersectional approaches that ensure data is inclusive, by collecting data about, for example, age, gender or location; to disrupt structural power imbalances by improving the accuracy and representation of Indigenous languages; to improve benefits and reduce harms to specific communities; or to actively create communities. It is plain that there are many aspirations for data stewardship, and that not all are compatible within a single context or mechanism.

Purposes can be operationalised through different types of mechanisms (and these map to different participatory objectives; see Table 4). For example, a data cooperative is intrinsically participatory and will be structured with data-subject and beneficiary representation in governance, with processes designed around an agreed level of inclusion in processes and decision-making. Data trusts and data-sharing pools have the capability to be highly participatory, or to enable a more hands-off approach to stewarding or governing data, relying on agreed rules and protocols. Appreciation of context is critical: there is no single approach to defining what is ‘fit for purpose’.

Our definition of data stewardship moves the focus from who owns the data to who takes responsibility for its use, and on whose behalf. This means that the kind of organisation in a stewardship role is also relevant to purpose. Non-profit and civil society organisations are likely to operate in the interests of data subjects (for example, through data trusts). Private organisations may want to share data for commercial benefit under specific conditions (for example, through data marketplaces). And public institutions may want to share data with each other or private businesses (for example, public-sector data holders) for a number of purposes, including economic and public benefit.^[87]

Looking particularly at the UK’s and EU’s data ecosystems, there seems to be a move towards large-scale and institutionalised public-sector data sharing. The UK’s 2021 National Data Strategy positioned responsible and efficient data sharing and access as essential to ‘unlock the vast potential of public- and privately-held data in the United Kingdom to drive innovation, boost productivity, create new businesses and jobs, and improve public services,’^[88] and we envisage this trend will be reproduced in other jurisdictions. Developing in parallel, there is a wealth of knowledge and experience in participatory data stewardship that is currently supporting small-scale stewardship models.

In relation to these two directions, it is useful to think about two different dynamics – top-down and bottom-up approaches to data stewardship in relation to power-holding and sharing – as a way of understanding potential tensions.^[89] Through that lens, it is possible to differentiate further between different purposes for data stewardship, and how they are mobilised by different actors in the ecosystem.

Figure 2: Different and intersecting purposes of data stewardship mechanisms

In bottom-up data stewardship, we see a strong motivation towards rebalancing power towards data subjects and those affected by uses of data, redistributing power and value from private companies to data subjects or those affected by its use, and ensuring legitimacy and accountability.

In top-down models, we see purposes aligned with public-sector incentives to share and generate value from data, or having the potential to disrupt private-sector data monopolisation and enable others to enter the market. This would support the distribution of power and value accrued from data through partnerships in supply chains, and would help a broad and diverse range of actors to more equitably distribute outcomes from data use.

Within these distinctions, there is alignment over disrupting power imbalances in the ecosystem, sharing and generating value from data, stewarding data towards an objective of public benefit and building better data practices. There is increasing evidence of support for different top-down data stewardship models that aim to support consistent data practices.

Examples include public data trusts, in which a public institution aggregates and uses data about citizens, sometimes in partnership with private-sector organisations and researchers. These models have been proposed as mechanisms to inform policymaking and public-service provision.^[90] The success of these initiatives is dependent on securing public trust: the effort by Google affiliate Sidewalk Labs to develop Toronto’s waterfront through a civic data trust was abandoned after public protest.^[91]

More recent examples include the UK Research and Innovation-funded Smart Data Research programme which is developing a range of data-sharing demonstrator models, including a data-sharing pool.^[92] This initiative has a commitment to openness, transparency, and involving the public ‘as much as possible’, and to ensuring the ‘highest standards of ethical conduct and responsible data practice’ are followed.^[93] While plans remain unclear at the time of writing, before the UK July 2024 elections, the Labour manifesto included an outline proposal for a National Data Library that would ‘bring together existing research programmes and help deliver data-driven public services, whilst maintaining strong safeguards and ensuring all of the public benefit’.^[94]

These examples bring to the forefront the critical role of the public sector in fostering data stewardship initiatives, and their potential to address known challenges. These challenges include, for example, that funding tends to incentivise institutional rather than community-led projects, or that citizens and publics are underrepresented as participants in decisions about data collection, or that private companies can be better incentivised to fund data stewardship and share data with public-sector initiatives.^[95]

The recent UK Data (Use and Access) Bill, first introduced during the 2022–23 parliamentary session, contains provisions which aim to stimulate innovation and facilitate increased use of data. For example, the draft bill includes a framework for smart data sharing, aiming to expand the open banking scheme to other sectors by facilitating the transfer of usage data between competing providers.^[96]

It also includes provisions that broaden the definition of scientific research and enable the reuse of data for any type of research that can ‘reasonably be described as scientific’, for both commercial and non-commercial activities. With compatibility being assumed between the initial purpose of collection and the new purpose, it is unclear whether fresh consent will be required for new purposes such as AI training and product development. This could potentially mean re-using data at scale in contradiction to the expectations and intentions of data subjects and perhaps even without their knowledge as there is no obligation to inform where notification requires a ‘disproportionate effort’.

While the favoured approach in the UK is towards large-scale data sharing, the EU seems to support both top-down and bottom-up approaches. For example, the EU has passed legislation supporting both large-scale initiatives aiming to create sectoral ‘data spaces’ in strategic fields,^[97] as well as legislation advancing small-scale data stewardship such as data intermediaries, including data cooperative services, and options for data donations under the concept of ‘data altruism’.^[98]

Governments are also embedding the language of data stewardship into national infrastructure. For example, in April 2024, the United Nations Economic Commission for Europe (UNECE) published a guide to encourage national statistical offices (NSOs) to recognise the need to transition to data stewardship roles. This recognises their changing role in the data ecosystem, from producers of statistics to providers of data and data-related services, particularly to public data holders.

It reinforces the role data stewardship can have in maximising the value of data assets, for ‘public good’ and to benefit ‘the full community of data users’.^[99] This impetus is reflected around the globe: New Zealand was an early adopter, appointing a Government Chief Data Steward in 2017, and Switzerland has recently appointed a Swiss Data Steward.

These examples are representative of an approach to increasing policy and legislative support for top-down models that helps mobilise some of the intentions to rebalance the data ecosystem. In particular, this includes shifting power away from private companies and towards public institutions that collect and use data for economic and societal benefit. These new initiatives have the potential to provide reliable data governance mechanisms and better protections than private-sector organisations, as they are already subject to established legal and normative duties towards publics.

Public-sector organisations are used to complying with legal duties to inform and consult, and many use public engagement mechanisms to involve people – and have embedded practices that relate to the less participatory end of the spectrum (referred to in Figure 1). However, it remains to be seen whether they will complement or match the bottom-up initiatives by embedding processes to ensure meaningful participation in data stewardship and data-driven decision-making. It is worth noting that some practices that are needed to support healthy data ecosystems may be missed, if only some parts of the spectrum of purposes of data stewardship are adopted.

The relationship of these top-down initiatives to the implication of responsibility to the interests of others, and to the full spectrum of participatory practices in relation to these new data-sharing initiatives – from involving through to collaborating and empowering – is a subject for future research, as is the development of a communicable and transferable set of norms across the various potential purposes for data stewardship. These might include, for example, a focus on responsible management of data in a rights-enhancing way, alongside public participation and inclusion practices that recentre power asymmetries.

What is the role of legislation?

Against this background, legislation has a central role to play in ensuring a healthy data governance ecosystem. However, tailored legislation is not necessarily the only way to define the terms under which participatory and inclusive mechanisms might develop, and legislators can also use and evolve legal mechanisms to set guardrails, mobilise incentives and make data stewardship easier.

There have been legislative debates exploring the role of new data intermediaries in Europe (now adopted into the Data Governance Act), India and Canada,^[100] initiatives to embed data stewardship at a national level in New Zealand^[101] and Switzerland,^[102] and an initially promising data reform consultation in the UK which included the concept of ‘data intermediaries’.^[103]

Policymaker discussions about data governance and stewardship have not delivered specifically tailored legislation, guidance or norms for negotiating terms of data creation, access, sharing and use with the aim of rebalancing power throughout the ‘data economy’, between individual data subjects, holders and users. The variety of mechanisms available – including trusts and cooperatives – may mean there is already potential to maximise agency and value without the need for significant amendment or addition to legislation.^[104]

A healthy data governance ecosystem that supports different actors to adopt participatory and inclusive approaches can develop within legal measures. These can underpin data governance mechanisms in ways that protect rights and freedoms and support different elements of participation and of power rebalancing. It remains to be seen whether current underpinnings will be sufficient to enable this outcome.

What is the role of participatory and inclusive approaches?

Alongside legal mechanisms, participatory and inclusive approaches play a substantial role in shaping the norms, guidance, standards and best practices that are needed to develop new, communicable and transferable data stewardship mechanisms. Because legislation is not prescriptive about data stewardship mechanisms, we propose that it is these normative rules and behaviours that will – over time – support the development of healthy data governance in specific contexts and across the wider ecosystem.

Figure 3: Normative rules’ role in bridging legal mechanisms and relational practices

However, in the current data ecosystem, participatory and inclusive mechanisms for sharing and stewarding data are still emerging. These mechanisms recognise the structural barriers to data subjects being equipped to manage their own data, and have a specific purpose towards empowerment. They put data subjects’ rights, governance and participation preferences at the centre of decision-making.

Participation can take different forms, from supporting individuals to control their own personal data to collective bargaining and collaborative decision-making. In practice, this can involve highly participatory and inclusive continuous consultation, dialogues or deliberation, through to less participatory models based on governance through specific principles or terms of service that are pre-agreed with specific people or groups. And more or less weight can be given to the importance of contributing to wider public good or benefit – and the inclusion of different people in establishing the legitimacy of those efforts.

Inclusion is distinct from participation – not simply a descriptor of participation done well, or of including people from a diverse variety of socioeconomic or identity-specific groups. Inclusion also implies a process of facilitating ongoing conversations and connections between people and issues over time. This can involve building communities by recognising and engaging in multiple ways of knowing, coproducing processes and content of decision-making, and continuing to build connections over time.^[105]

Perceptions of benefits and risks of data sharing by minoritised groups – for example, blind users contributing disability-inclusive data to accessibility datasets – point to requirements for further exploration of responsibilities for data stewards to ensure rights- and privacy-preserving data management, and to limit purpose, including secondary uses and commercialisation.^[106] Direct engagement of communities and accessible approaches to planning and coordinating government services for vulnerable people may support, for example, open datasets being used to empower people beyond those who are already represented and empowered.^[107]

The UK Statistics Authority Inclusive Data Taskforce review of the UK’s approach to the collection, analysis and reporting of data and evidence provides some useful concepts. Their consultations highlighted critical data gaps in the UK and proposed recommendations for improving the inclusivity of data collection and use. Inclusion-related barriers to participation in data involved trust, lack of appropriate measures used in data collection, lack of accessibility of data collection exercises and lack of perceived personal or community benefit from participation.^[108]

Participatory mechanisms do not enforce a particular model of participation: they can be more or less participatory, depending on the wishes of data subjects and the mechanism’s conception of legitimacy in relation to public benefit. They can relate not only to data-management practices but also to decision-making processes, determining who is able to make decisions about access, control, use and benefit.^[109] A database of more than 110 initiatives shows the variety of approaches and how they can potentially drive empowerment.^[110]

Mechanisms can encompass the development of foundational governance documentation and subsequent consultation and participation in data governance. Importantly, they do this in ways that ensure alignment with agreed purposes that respect the wishes of data subjects / data holders, and (if the data steward is equipped to perform this legal role^[111]) enforce relevant rights and protections.

Different data stewardship mechanisms aim to empower people in different ways and are intrinsically more or less participatory (see Tables 1–4). Data cooperatives, data unions and data commons have participation and inclusion designed into their ways of working. Data trusts and data-sharing pools can support inclusivity and participatory empowerment, if they are specifically designed to prioritise benefit and participation of data subjects over data holders.^[112]

However, it is important to recognise that participatory mechanisms are not inherently inclusive, and can create conditions in which groups or communities feel excluded or disconnected from each other, as well as creating communities in which people feel included. Inclusive processes engage positively with differences to generate new understandings, so, for example, a deliberative process that brought together disparate knowledge and experience towards an informed judgement would be inclusive. If framed by and perpetuating the views of power-holders, a deliberative process would not be inclusive, or able to produce inclusive outcomes.^[113]

Nevertheless, these participatory mechanisms have the potential to disrupt the current data ecosystem in different ways. For example, if multiple bottom-up data trusts appeared – with interoperability and portability (the ability to move trusts as required) built into the system, negotiating data use on behalf of data subjects as outlined by the trust – this would introduce choice for data subjects and competition to powerful platform ‘data intermediaries’.

Analysis of legal and participatory mechanisms

Understanding the legal underpinnings that make up data stewardship and technology governance is an important foundation for constructing a comprehensive picture of what constitutes participatory and inclusive data stewardship today.

This section sets out important building blocks for data stewardship, how legal provisions currently ensure rights and protections, and how different levels of participation are encoded in practice.

Existing legal underpinnings for data stewardship

While there is no tailor-made legislation for establishing alternative data governance structures, there are existing legal rules and norms that offer foundational architectures for protection and participation (such as data protection legislation), as well as core principles for their constitution and establishment (such as trust law establishing data trusts, or regulation around designing data cooperatives).

The data governance mechanisms outlined in Table 1, Table 2 and Table 3 have emerged from the need to embed rights-preserving practices as laid out in legislation. They are designed to build on existing legal protections such as privacy, security and human rights. They also provide foundations for mechanisms for participation that can support data stewardship. Key considerations that will translate into any stewardship relationship will include, among others, principles of lawfulness, fairness and transparency, purpose limitation and accountability as inscribed in data protection law.^[114] There are different approaches to enabling legal and participatory mechanisms and this section will briefly discuss the directions taken by the UK and the EU.

In the UK, the 2021 data reform consultation included a discussion around data intermediaries as a responsible and innovative solution to data sharing. Further plans seem to have taken a large-scale data-sharing approach to the concept of ‘data intermediary’.^[115] The Government response to the consultation suggests that the submissions it received understood ‘data intermediaries’ in two broad categories: intermediaries that facilitate personal data exchange between parties or in a network; and intermediaries that make it easier to manage access rights to confidential data. The latter option seems to have been understood as similar to Smart Data schemes, which the government already intended to legislate.^[116] A framework for Smart Data schemes is included in the 2024 draft Data Use and Access Bill (DUA Bill).

The UK’s smart data vision seems therefore to be focused on enhanced data access, portability and interoperability. While these elements are important building blocks, they are only part of the broader purposes of data stewardship, which – as described in the mechanisms above – encompass more qualifications around how data is collected and used, and the option for more involvement in decisions over the management of data.

Other UK legislation can be seen as supporting data stewardship through regulating anti-competitive behaviour (the Digital Markets, Competition and Consumers Act 2024 (DMCC)) and through stewardship to prevent risks from illegal and harmful content online (the Online Safety Act 2023 (OSA)).^[117] The DMCC focuses on regulating digital markets, strengthening competition enforcement powers and enhancing consumer protection. The DMCC creates a new regime for regulating large businesses with a ‘position of strategic significance’ in respect to digital activities. This regime will be overseen by a specialised unit within the Competition and Markets Authority (CMA). The CMA will have new enforcement powers to combat anti-competitive behaviour and issue substantial fines for non-compliance.

From the angle of rebalancing market power and user empowerment, the DMCC provides a reform of competition and consumer law, with core objectives to make sure that users are treated fairly; are able to choose freely and easily between services or digital content; and receive information to understand the service and its terms and to make an informed decision.

From the angle of stewardship against risks from illegal and harmful content online and increasing protection for certain groups (children), the OSA introduces rules for online service providers around filtering harmful content in recommended content, adjusting content moderation practices, removing harmful materials and performing risk assessments and implementing mitigation measures.^[118]

The EU has opted for a more comprehensive picture of data stewardship, which is fostered through a series of legislative mechanisms that address rights, protections, incentives and market dominance. This legislative package is discussed in the Appendix. While parallels can be drawn between European and UK legislation on online services and competition in terms of intent (the UK DMCC and the EU DMA; the UK OSA and the EU DSA), their scope and approach are different.

Architectures for protection and participation

First, the architecture of protection and participation created under the GDPR supports data stewardship in the following ways:

Direct obligations for data controllers that are designed to provide the rules and safeguards around data processing. For example, important protection mechanisms are rules that do not allow the processing of data unless certain conditions are met, such as in the case of special category data (which can include sensitive information about racial or ethnic origin, religious, political or philosophical beliefs, genetics, biometrics, health, sex life, or sexual orientation). In terms of supporting participatory mechanisms, transparency obligations require data controllers to inform data subjects about data collection and to offer explanations about the logic of processing functions such as automated decision-making. Moreover, rules on conducting impact assessments on new technologies require data controllers to consult data subjects by seeking their views.
Implicit obligations that are designed to refocus attention on people and how they are affected by data-processing operations. These can be protection mechanisms: for example, the fairness principle implicitly requires an evaluation that starts with thoughtful consideration of whether the data should be processed in the first place, whether personal data might be used in ways that people don’t reasonably expect, and whether it might have unjustified consequences for groups or individuals. Implicit obligations can also include opportunities for participation through co-design and collaboration under the data protection by default and by design obligation. (Although not directly mandated, there is nothing stopping data controllers from asking for direct involvement from data subjects when designing new systems.)
A rights framework that empowers individuals with more choice and control over their data. For example, the right to object to data being processed, the right to access what data is being processed about them, the right to correct and rectify information, and the right to move or port data to different services or into alternative structures.
Additional mechanisms for participation are options for representation that allow individuals to be represented by a body either through a direct mandate or independently from it (Art. 80 GDPR).

Second, there is a package of new EU digital regulation that focuses on better stewardship of data and power corrections. It does this through regulating business practices; through enhancing individual rights, transparency obligations and accountability (the DSA and the DMA); through establishing data intermediaries and mechanisms for the voluntary sharing of data (the DGA); and through facilitating use, access and portability of Internet of Things devices (the DA). See Appendix for further detail on EU legal underpinnings for data stewardship.

Core principles for the constitution and establishment of data stewardship

The foundational architectures presented (starting with data protection law and moving into specific legislation like the EU Data Governance Act) establish the basis of functioning. Alongside these, trust law and rules around designing cooperatives represent pillars for the constitution and establishment of data stewardship models.

For example, the flexibility of trust law means that its concepts can be applied to a broader understanding of what type of assets can be placed in a trust, including the exercise of data rights to form a ‘data trust’. Fiduciary duties are another important element of trust law: the interests of individuals or groups of people are protected and cared for by a responsible person or body.

Similarly, the flexibility of the legal form for cooperatives allows members of data cooperatives to choose how to be governed and how much to contribute in order to participate.

Frameworks for participatory and inclusive data stewardship

In addition to legal protections of rights and freedoms that enable data stewardship mechanisms, there are emerging normative rules and behaviours (‘norms’) developed by civil society and policymakers that support understanding and communication of how to steward data in participatory and inclusive ways. Building on the legal underpinnings, they ground the concept of data stewardship into tangible models and options for data governance and participation.

Some of these norms build on legal mechanisms and some are developed through relational, participatory practices. Here, we do not explore all the different on-the-ground initiatives that are actively building practices and norms (see examples here). Instead – as well as reviewing significant developments in the ecosystem – we have carried out an analysis of frameworks, as the visible, synthesised encodings of emergent rules and behaviours.

This brief analysis provides an overview of their development over time, from those relating to participation in policymaking, to participation in technology development or governance, to specific frameworks relating to participatory and inclusive data stewardship. It also indicates a direction of travel, distinguishing them as conceptual, operational and – just beginning to emerge – evaluative, and summarises their significance in the inclusion, involvement and empowerment of people in decision-making about data. A full outline of each framework is provided in the Appendix.

Purposes of participation

There are three significant conceptual frameworks that support understanding of the different purposes of participation, and how those are met by different mechanisms. Together, these frameworks provide useful guidance for organisations setting up participatory mechanisms – and a test during implementation that the purposes of participation are being met by the mechanisms chosen:

US policymaker Sherry Arnstein’s 1969 ladder of participation (mapping engagement of publics from non-participation to tokenism to citizen power, to ‘encourage a more enlightened dialogue’ and explode the ‘empty ritual of participation’) is widely acknowledged as the foundation for contemporary thinking on participatory practices that connects participation to power and accountability. It sets out a theoretical framework for policymakers, for how the public can be supported to participate in governance and democracy. It has an explicit political purpose: to redefine participation as redistribution of power and support greater empowerment of people in political and economic processes, with a goal to enable them to share in the benefits of an affluent society.

The International Association for Public Participation (IAP2) spectrum sets out clear language for the purposes of five different mechanisms (informing, consulting, involving, collaborating and empowering) for policymakers and other powerholders: to consult is ‘to obtain public feedback on analysis, alternatives and/or decisions’, whereas to empower is ‘to place final decision-making in the hands of the public’. It is also significant for translating those into clearly understandable language as ‘promises’ for publics: ‘We will keep you informed, listen to and acknowledge concerns and aspirations, and provide feedback on how public input influenced the decision.’ Or in the case of empowerment: ‘We will implement what you decide.’

The Ada Lovelace Institute Participatory data stewardship framework (2021) relates the above frameworks specifically to the context, opportunities and challenges of data stewardship. The framework describes in detail the factors that contribute to each of the IAP2 objectives. For example, people can be informed about what is happening to data about or affecting themselves through organisations adopting ‘meaningful transparency’ and explainability, or consulted through community engagements and public attitudes research. When people are empowered, they play a significant role in making decisions about how data is governed, through continuously shaping rules, being involved in data-access mechanisms, owning or controlling mechanisms like cooperatives or otherwise deciding terms of data access or licensing. This framework pushes the concept of empowerment further, towards publics designing and developing their own data governance frameworks, with power-holders providing advice and assistance on request.

Figure 4: The spectrum of public participation purposes in relation to empowerment

Operational norms

As identified above, participatory and inclusive data stewardship mechanisms are in early stages of development. Nevertheless, some organisations have sought to analyse, understand and communicate the emerging norms, guidance, standards and best practices that will be needed to develop new, communicable and transferable data stewardship mechanisms. These organisations range from civil society and non-profits to national governments.

Mozilla’s Practical Framework for Applying Ostrom’s Principles to Data Commons Governance (2021) emphasises the need to consider the practical (sociotechnical) infrastructure around the core data governance purpose, including finances or community behaviours. Its framework applies specifically to the data commons mechanism, but the normative insights are transferable across other participatory data-sharing initiatives. The framework acknowledges the range of possible models – from revenue raising to privacy preserving, delegation of responsibility to representative voting or management.

The framework also acknowledges a ‘usability gap’ between principles and practice, which it attempts to fill with provocations designed for new data commons initiatives to answer. Putting the answers into practice will develop the norms, rules and behaviours that shape each individual model. The range of questions demonstrates the complexity of these kinds of initiatives, and the skills and experiences required by organisers to make them work well. For example: How will you handle money and shared values? How will you value contributors’ time? How are contributors to your community forum expected to behave?

The framework identifies purpose (see conceptual frameworks above) as the primary boundary requirement: Is there a clear purpose? Is there a stated purpose for the creation of the data commons and for the collection and use of the data? Is the underlying rationale for creating the data commons clearly set out? Who does this benefit and how? It also provokes consideration of participation and governance: Where do we envision stakeholders to engage and deliberate? How are decisions about data access made? What happens when someone violates data use rules? Do rules related to data collection and production have a corresponding set of accountability measures? Are conflict-resolution mechanisms easily accessible by all stakeholders?

The New Zealand Government data stewardship framework for NZ is unique among these frameworks because it represents a national government approach to stewarding data. The framework contains seven elements: strategy and culture; rules and settings; roles, responsibilities and accountabilities; data capability and quality; people capability and literacy; influence and advocacy; monitoring and assurance.

The description of the framework implies a responsibility to specific beneficiaries: to ‘better manage and use the data it holds on behalf of New Zealanders’. That responsibility is operationalised as a top-down model, with the implication that the framework’s legitimacy and accountability rest on involvement of elected government and civil service representatives, using mechanisms of informing and consulting. In relation to involving or empowering publics or affected people, it includes co-design of a Māori data governance approach, but there is no evidence of more generalised participatory mechanisms to guide development of governance processes or decision-making.

The Open Data Institute What makes participatory data initiatives successful? is an early attempt to identify norms that contribute to success and sustainability through a synthesis of literature. It found that ‘participatory data initiatives are united in their conceptualisation of success as achieved through creating impact and changing the lives of the participating community,’ and that this goal structures decisions, design and outcomes. Effective project implementation and realisation of intended outcomes are currently prioritised over long-term sustainability of initiatives.

There are three conceptualisations of success in participatory initiatives: leading to measurable outcomes and change; building new evidence and knowledge that reflects the lived experiences of people and communities; and empowering communities to develop agency and resilience. Five factors for success communicate important developing norms: involving diverse representatives of communities; mobilising skills and experience; ensuring sufficient time and resources; intentionally linking design choices to outcomes; and being responsive to external factors.

The Aapti playbook Fostering participatory data stewardship is the most developed and extensive of the operational frameworks, representing a broad range of on-the-ground, transferable learnings in the form of identified challenges and strategies. It focuses on three principal enablers (or ‘plays’): the role of the public sector in ecosystem enablement, technical pathways for operationalising community participation in data stewardship and scaling community governance through data stewardship. These are all underpinned by the challenge of the need to foster effective participation.

For example, in relation to this need to foster participation, the playbook identifies a range of strategies at different scales, from identifying incentives of data generators, to creating partnerships with existing community-led initiatives, to supporting legal and technical ‘participation by design’ infrastructures, to remodelling the regulatory landscape to build on consent and data protection for personal data sharing and towards approaches that recognise the social value of data.

Inclusion

Inclusion of people in datasets is both a technical and a participatory consideration. Therefore the 2016 FAIR principles (Findable, Accessible, Interoperable and Reusable), which recognise the structuring role of computational processes in data management, are a vital underpinning for participatory and inclusive data stewardship. They are particularly relevant for the scientific and research communities, in relation to the creation and use of large datasets for public-interest innovation research. The 2020 CARE principles (Collective benefit, Authority to control, Responsibility and Ethics) build on the FAIR principles to recognise, support and rebalance power specifically towards the rights and interests of Indigenous peoples, and more generally towards equitable participation and outcomes in data access and use.

They ensure decisions are connected to values, not just of individual data subjects, but also groups or communities. The Joint Research Centre (European Commission) Mapping the landscape of data intermediaries takes a broader perspective in relation to inclusion, towards rebalancing power in the data ecosystem. In relation to control and agency, inclusive practices support a greater diversity in access to data and decision-making in relation to how data is shared, accessed and used, when compared to data-sharing by large corporate platforms.

The mapping recognises that lower-power actors in the data economy include not only citizens and civil society organisations, but also SMEs and local public authorities who could be empowered to build functional data-driven services outside the power of data monopolies. In relation to value and benefit sharing, the mapping recognises multiple kinds of value generation, including economic, but also social and moral value. It stresses the importance of fair distribution of outcomes of data use, including profit sharing.

Legal and participatory mechanisms for data stewardship

Mapping legal and participatory mechanisms can make visible where there is potential for participatory objectives and mechanisms to contribute to building successful future models. Table 4 shows how legal and participatory mechanisms can intersect and where gaps remain in practice.^[119]

Table 4. Mapping legal and participatory mechanisms to objectives for data stewardship

Objective, description^[120] and what people can expect^[121]

How this expectation is matched in existing regulation

How participatory mechanisms and practices build on existing regulation

Informing

Description: ‘A one way flow of information’

What people can expect: ‘We will keep you informed on how your data is being used’

Data governance type:
All

This expectation to ‘inform’ is mandated in legislation through several types of provisions.

For example, in data protection law, information requirements include an obligation for data controllers to provide data subjects with a set list of transparency information, either immediately at the time of collection or within a reasonable time after data has been processed (Art. 12–14 GDPR).

In the case of automated decision-making, the information obligations include ‘meaningful information about the logic involved, as well as the significance and the envisaged consequences of such processing for the data subject’ (Art. 13(2)(f) and 14(2)(g) GDPR).

Explanation needs to be shared ‘in a concise, transparent, intelligible and easily accessible form, using clear and plain language, in particular for any information addressed specifically to a child’ and free of charge. (Art. 12 (1) GDPR).

In digital services regulation, intermediary services are requested to include in their terms and conditions information about content moderation policies, algorithmic decision-making, information about recommender systems and human review procedures. Intermediary service providers also need to publish transparency reports (Art. 14, 15, 24 and 27 of the DSA).

Transparency obligations for intermediary service providers include the requirement for online platforms to clearly display to users that the information is an advertisement, along with meaningful information about the parameters that determine why the user sees the ad (Art. 26 of the DSA). Plus additional requirements to create an ad repository (Art. 39 of the DSA).

Informing can be either a distinct objective (if that is the level of involvement that data subjects have decided they want) or, in a more participatory and inclusive model, foundational to a more empowering objective, supporting transparency and information-sharing.

Limitations for participation: Provisions for data controllers to inform data subjects that data has been processed provide a minimal obligation for data controllers to make information available at the time of collection, but in practice this is a ‘take-it-or-leave-it’ approach. Data subjects’ direct recourse is to withdraw consent or object to processing.

By supporting understanding of automated processing and automated decision-making, explainability and transparency mechanisms like data and algorithm registers can support informing,^[122] but only if they are communicated in non-technical language.

Mechanisms for participation: Community engagement such as through awareness campaigns and support centres can inform about collective as well as individual concerns and benefits.

In practice, private and public-sector data controllers would have a minimal legal requirement to notify about data use. A participatory and inclusive data stewardship mechanism like a data cooperative, because it is in a more trusted and equitable relationship with data subjects, would inform not just about data use but also about ways of working, decision-making mechanisms and progress towards agreed outcomes for individuals and collectives.

Questions for participation: Is this legal basis sufficient in practice to really inform people, in an already inequitable data ecosystem?

Consulting

Description: ‘Inviting people’s opinions, through attitude surveys, neighbourhood meetings and public hearings’

What people can expect: ‘We will listen to, acknowledge concerns and aspirations, and provide feedback on how public input influenced the data-governance framework’

Data governance type:
All

In the context of data processing involving new technologies (such as AI or automated decision-making), data controllers need to perform an impact assessment and shall seek views of data subjects or their representatives, where appropriate (see Art. 35 (9) GDPR).

In data protection law, there are regulator duties to consult ‘interested parties’ (see Art. 70 (4) GDPR).

In digital services regulation, service providers have the obligation to perform risk assessments. They need to make sure the assessment and risk mitigation are based on the best available information, with the involvement of representatives of users and affected groups, independent experts and civil society organisations (Art. 34 and Recital 90 of the DSA).

At a local level, authorities may have duties to consult (see duties for local authorities in the UK,^[123] together with guidance^[124]).

Consulting can be either a distinct objective (if that is the level of involvement that data subjects have decided they want) or, in a more participatory and inclusive model, foundational to a more empowering objective, supporting understanding of the wishes of data subjects and beneficiaries throughout the lifecycle of the stewardship mechanism.

Limitations for participation: Mandating consultation with data subjects for new technologies and high-risk processing reinforces a model in which the views of data subjects are material to decision-making. However, this type of direct consultation is rarely applied in practice.

In practice, there is no obligation in data protection law for the data controller to provide feedback or to explain how views were taken into account. This is a one-way consultation, not a feedback loop. However, we see a mandated feedback loop in digital services regulation, where platforms are required to act on recommendations received from independent audits and take necessary measures to implement them (Art. 37 of the DSA).

Mechanisms for participation:

Surveys, focus groups and public attitudes research to understand individual views, and lived-experience panels, community engagements, network events and consultations to understand collective views.

In practice, where regulators and local authorities are mandated to consult and provide feedback, participatory data stewardship mechanisms can build consultation and accountability into all or any stages of development – from deciding on the data governance model and purpose to iterating ways of working, to appointing trustees or data stewards.

Questions for participation: How can participatory practices ensure consultation is meaningful, not box-ticking? How can these practices support accountability?

Involving

Description: ‘Allow citizens to advise, but retain for powerholders the continued right to decide’

What people can expect: ‘We will work with you to ensure your concerns and aspirations are directly reflected in data governance… we will provide feedback on how public input influenced these decisions’

Data governance type:
All data cooperatives, data commons and data unions; some models of data trusts and data sharing pools

Providers of very large online platforms or very large search engines should draw on recommendations from civil society on mitigation measures for systemic risks for electoral processes (European Commission guidelines on the DSA).^[125]

Involving can be either a distinct objective (if that is the level of involvement that data subjects have decided they want) or, in a more participatory and inclusive model, foundational to a more empowering objective, supporting contributions by data subjects and beneficiaries to the development of processes and decisions throughout the lifecycle of the stewardship mechanism.

Limitations for participation: Participation is often unidirectional and, in the absence of enforcement and penalties, powerholders can choose not to engage with, respond to or act upon recommendations sent by representatives of civil society organisations.

Mechanisms for participation: Participatory mechanisms to support involvement are different to consultation, and include one-off and institutionalised public deliberation and deliberative democracy initiatives; and lived experience panels.

Involving people in developing processes and decision-making brings in a range of specific participatory mechanisms that are designed to enable people to come together, be informed about relevant considerations, deliberate on complex problems or trade-offs and reach decisions that have some community consensus and legitimacy.

Questions for participation: How to design participatory mechanisms to maximise opportunities for participants to have their wishes reflected and acted on in processes and governance?

Collaborating

Description: ‘Enables people to negotiate and engage in trade-offs with powerholders’

What people can expect: ‘We will look to you for advice and innovation in design of data-governance frameworks… and incorporate your advice and recommendations to the maximum extent possible’

Data governance type:
All data cooperatives, data commons and data unions; some models of data trusts and data sharing pools

Opportunity for collaboration for data controllers to co-design ‘data protection by design and by default’ requirements, but not an established practice (see Art. 25 GDPR).

‘Trusted flaggers’ provisions in the DSA, where the online platform provider needs to process, give priority and act on notices submitted by trusted flaggers (Art. 22 DSA).

Policies on co-production of public services (such as in the UK NHS^[126] and signs of attempts at community participation in devolved care).^[127]

Collaborating implies a more participatory and inclusive model. It can be foundational to a more empowering objective, in which data subjects and beneficiaries contribute meaningfully to the development of processes and decisions throughout the lifecycle of the stewardship mechanism, although power is still held by the organisation.

Limitations for participation: There are some legal underpinnings for collaboration, and some models exist in the public sector, particularly healthcare, but these are not mandated or completely established in practice.

Mechanisms for participation:

There is lots of scope for participatory mechanisms to supplement legal provisions, including: public deliberation and deliberative democracy initiatives; bottom-up data governance initiatives managed by an independent intermediary; participant panels and data donation mechanisms.

For these mechanisms to be legitimate and effective, power holders have to be prepared to create space to listen to different kinds of expertise, and change their practices in response to feedback from data subjects and beneficiaries.

Questions for participation: How to design space for equitable collaboration that respects different expertise, knowledge and skills, including lived experience?

Empowering

Description: ‘Citizens obtain the majority of decision-making seats or full managerial power’

What people can expect: ‘We will provide advice and assistance as requested in line with your decisions for designing/developing your own data-governance framework’

Data governance type:
All data cooperatives, data commons and data unions; some models of data trusts and data sharing pools

Data rights and data portability empower data subjects to 1) make decisions about their data (for example, object to processing) and 2) have access to their data, download it, and transfer it to a different service if they are not satisfied. This allows for easier pooling of data (for example, in structures like data cooperatives). See data rights in Art. 15–18 GDPR and Art. 20 GDPR (data portability rights further enhanced in Art. 6 of the Digital Markets Act).

For automated decision-making, data subjects can request ‘human intervention’ or oversight over the decisions (see Art. 22 (3) GDPR).

Data donations/data altruism in the Data Governance Act.

Increased data access and switching in the Data Act.

Empowering necessitates a participatory and inclusive model, in which data subjects and beneficiaries make decisions about the development of processes and decisions throughout the lifecycle of the stewardship mechanism, supported by expertise and administrative processes provided by the organisation.

Limitations for participation: Legislation contains provisions for data subjects to exercise data rights, but no obligations to systematically empower people in decision-making.

Data portability, as prescribed by the GDPR, only applies to data provided by the data subject and supports individual agency, whereas – in the current data ecosystem – data about individuals is often generated without their knowledge or control, and redress requires a collective understanding and approach.

Mechanisms for participation:
There is scope for participatory mechanisms to extend legal requirements in the following ways: data governance rules are shaped and routinely reviewed by data subjects and beneficiaries; they set terms of data licensing and access, and have voting powers on governance boards of data-access initiatives; and they have ownership and/or control of intrinsically participatory mechanisms like data cooperatives.

Questions for participation: How to leverage provisions to build empowerment in the current ecosystem, for those who want meaningful participation? Is the bar to empowerment set too high for non-specialists to proactively set up empowerment mechanisms?

Findings from the landscape review

At the time of writing, there has not been another landscape mapping specifically in relation to participatory and inclusive data stewardship practices. Other mapping projects provide a view across existing data stewardship initiatives including types of participation,^[128] and insights into specific examples of data stewardship practices,^[129] such as data institutions.^[130]

Our findings

1. Many organisations researching or implementing data stewardship mechanisms are academic institutions based in the west, or working in the domain of health data

Keyword searches identified 2,037 sources of information. Once duplicate and irrelevant sources were removed, the final dataset consisted of 262 examples of literature and projects from over 250 different organisations/institutions.

Summary of features of the dataset (further detail in Appendix)

The dataset had a western bias, which is likely to be due to searches being conducted in English from the UK. Most organisations (69%) represented in the data were based in North America, the UK or other European countries (Figure 5).

Academic institutions comprised half (50%) of the organisations in the dataset, followed by non-profit organisations (15%), suggesting these institutions may be leading in developing the emerging field of participatory and inclusive data stewardship. Organisations were categorised based on the language they used to describe themselves in their mission statements or ‘about us’ pages.

Organisations identified in the dataset used language around public benefit, public good and community-centred approaches to describe their work. There were some variations in this language by sector and type of organisation. For example, the small sample of private companies in the data often used language around security and transparency, while government organisations similarly focused on transparency, but also had terms around accountability.

Many sources in the data (25% of entries) related to health data. Examples spanned voluntary data donation by members of the public, electronic health record platforms and secure research environments and repositories with health-related datasets for research and development purposes. One reason for this could be established patient and public involvement and engagement (PPIE) practices in this field.^[131]

Implemented projects – that is, activities that had been trialled or rolled out with real people – made up a slightly higher proportion of our dataset entries than theoretical work.^[132] Implemented projects ranged from national initiatives like New Zealand’s national data stewardship framework^[133] and the US National Institutes of Health Accelerating Medicines Partnership^[134] to more localised projects such as Data Driven Detroit^[135] and Vision Philadelphia.^[136]

Figure 5: Map of organisations in dataset

Nearly one quarter (23%) of the formal data governance examples identified were data trusts. This was followed by data cooperatives (12%), data-sharing pools (12%) and personal information management systems (6%; Figure 6). However, categorising entries against governance structures was a challenge, particularly as many entries did not explicitly reference specific legal mechanisms.

How organisations describe themselves and their purpose

To map some of the language organisations were using in this space to describe their work around participatory and inclusive data stewardship, we extracted keywords from their mission statement or ‘About us’ pages and grouped them thematically.

The two most common themes these organisations identified as driving their purposes were public benefit or good and being a community-centred organisation. The former included referring to concepts such as ‘common good’, working towards ‘communal benefit’ and tackling ‘societal problems’ to make a positive ‘societal impact’. The latter referenced ‘community building’, offering ‘community support’ and ‘building community capacity’. Together, these themes suggest an aspiration for collective approaches towards data stewardship rather than individualised approaches.

Many also mentioned safety or security of data, providing trustworthy services, improving transparency around data, and acting as independent bodies. These themes draw attention to the role of stewardship safeguarding data in a trustworthy and unbiased way.

Equitable access to data, empowerment and inclusion were also common themes. In particular, mentions of empowerment were often accompanied by assurances around transparency of data practices and access and control over data.

2. Data trusts were the most commonly described governance structures, followed by data cooperatives

Categorising against a predetermined list of governance structures (trusts, cooperatives, data-sharing pools, etc.) was a challenge, as mentioned above. Twenty-two per cent of entries in our dataset did not reference a specific governance structure, but rather general principles or policies around governance to increase participation and/or inclusivity. This included sources exploring principles of data ownership,^[137] the FAIR principles for research data stewardship (findability, accessibility, interoperability and reusability)^[138] and Indigenous data sovereignty principles.^[139] ^[140]

Other sources made reference to internal governance mechanisms rather than legal ones. For instance, Digital Earth Africa, a platform providing environmental data (for example, data on agriculture and water availability) to government and industry, has a stakeholder community group within its governance structure.^[141] Similarly, projects within the US National Institutes of Health Accelerating Medicines Partnership are overseen by steering committees for each of the disease areas studied, with each committee having representation from non-profit organisations specialising in the disease being studied.^[142]

One new term for a potentially uncategorised mechanism, ‘data collaboratives’, emerged in this research. However, it refers not to a novel mechanism but to partnerships between different sectors (including private companies, government agencies and research institutions) that use existing mechanisms to enable the exchange of data to solve public problems.^[143] Examples include a collaboration between a mobile operator, a non-profit organisation and government agencies following an earthquake in Nepal, using phone data to assist with relief efforts.^[144]

Figure 6 presents an overview of the data governance structures identified.^[145] The following sections highlight some examples from the most prevalent types of data governance.

Figure 6: Data governance structures identified

Data trusts

Data trusts varied in their coverage, from regionally contained initiatives to organisations that operated at state or national levels. Examples included the Brixham Data Trust,^[146] an initiative in a small fishing town in Devon, supporting collective decision-making on the deployment of local data resources; the Born In Scotland Data Trust,^[147] an initiative around developing trustworthy data stewardship around a pilot birth cohort research study; and PLACE,^[148] a non-profit technology organisation providing mapping data from cities in the African continent for a global audience of data users. However, only 28% of data points relating to data trusts were implemented projects (and these were self-identified; we did not examine the legal bases for implemented projects). The remaining references were to theoretical literature.

Data cooperatives

Data cooperatives focused on democratising access to personal data. Examples include Open Humans,^[149] a platform for communities and individuals to explore, analyse and donate personal data, and Driver’s Seat Cooperative,^[150] a platform for ride-hail drivers and delivery workers to collect, share and profit from their own data. Unlike data trusts, data cooperatives interact with business and commercial actors as well as members of the public, sometimes to leverage economic rewards for those sharing their data. As with data trusts, only 26% of data points relating to data cooperatives were implemented projects, and the remaining references were to theoretical literature.

Data-sharing pools

Most of the examples of data-sharing pools in our dataset were in the health domain. They were often platforms that individuals could voluntarily contribute their health data to, such as the UK Biobank,^[151] Answer ALS^[152] and Our Brain Bank.^[153] These organisations frequently have some level of participant or patient oversight in their governance procedures. Outside the health domain, examples of data-sharing pools included the Agricultural Research Federation, a platform for agricultural data providers to share data for research.^[154] Within this category, more entries related to implemented projects (64%) than theoretical literature.

Personal Information Management Systems

We identified a few examples of personal information management systems. These examples tended to rely on voluntarily shared data, with the aim of protecting privacy and maximising security around personal information. Examples include the Solid Project,^[155] which allows people to store their data in secure, decentralised stores; and BitsAboutMe (which has since ceased operation),^[156] which similarly allowed people to merge data from a range of digital platforms into a secure, decentralised platform. As with data-sharing pools, more entries in this category related to implemented projects (61%) than to theoretical literature.

3. Participatory mechanisms identified were varied, and the relationship between participation and power may not be as linear as previously conceptualised

To understand the types of participatory mechanisms being used and discussed in the landscape, we coded each source in relation to our 2021 Participatory Data Stewardship framework (Table 1). The mapping identified a range of sources exploring participatory data stewardship at the level of informing through to empowerment.

Most entries were coded as either collaborating with those affected by the data (27%) or empowering those affected by the data (23%). The remaining entries were evenly spread across informing (11%), consulting (16%) and involving (14%). Nine per cent of entries were coded as ambiguous or unclear in relation to where they would fall on the ladder. Figure 7 shows the prevalence of each level of participation in the dataset.

Figure 7: Levels of participation in the dataset

Applying the framework to this dataset provided some challenges. For instance, differentiating between collaboration and empowerment was not straightforward.

Our original framework positions collaboration as enabling people affected by data to engage in trade-offs with powerholders, with key mechanisms including bottom-up data governance initiatives like data trusts, participant panels, data donation mechanisms and public deliberations and democracy initiatives.

Empowerment was described as affording majority decision-making power to those affected by data, with mechanisms that allow governance rules to be shaped and reviewed by beneficiaries and those donating data, voting rights on governance boards, or ownership and / or control of data cooperatives.

Both these concepts reference representation in governance structures, and it was often difficult to judge where and when majority power was afforded to communities and individuals because they were within other institutional structures. Levels of actual power afforded to various governance mechanisms were opaque or difficult to infer from publicly available information.

There was also variation within each category. For instance, within collaboration some organisations relied on voluntarily donated data but afforded individuals little control over how that data was used, while other models allowed for dynamic consent of voluntarily donated data and more formalised involvement in governance. The latter implies ongoing engagement while the former suggests a static relationship between data donors and data holders. These reflections suggest ways in which the framework can be refined in subsequent iterations.

The following sections further describe the sources we identified by participatory mechanism.

Informing

Open data initiatives

Most entries coded as informing were open data initiatives across various regions. They tended to focus on the utility of making civic data accessible at local levels to address community concerns and needs. An example of this includes a 2020 review of open data initiatives across Quebec, which found that these initiatives are promising in terms of increasing government transparency in that they allow civil society actors to monitor the performance of some government activities.^[157]

Building community capacity

Initiatives in this category also sought to build community capacity around data. Activities included community outreach around digital literacy and open tools for accessing and using open data,^[158] making civic data more accessible for public consumption and reuse,^[159] or working with trusted community organisations to promote engagement with civic data.^[160]

Keeping community up to date

We also identified literature that suggested organisations should have a designated data steward whose responsibilities centre around informing the public on how their data is being used. For example, the GovLab identified ‘partnership and community engagement’, a means of informing those affected by insights generated from their data, as one of the five key roles of a data steward.^[161]

Consulting

Community engagement activities

The landscape mapping identified a range of sources exploring consultation mechanisms as a key component of data stewardship. Literature recommended the use of community engagement to build trust between stakeholders,^[162] ^[163] particularly in relation to accessibility datasets,^[164] and to improve the quality of services that stakeholders could offer to the public.^[165] ^[166]

Examples of consultation in practice included a survey the Diabetes Research on Patient Stratification conducted with their participants to inform their post-project data governance strategy.^[167] While this method invites opinion, it does so in a static, one-way flow of information, and participants are not empowered to influence decision-making.

Involving

Embedding participatory practices

While examples of involvement also referenced the value of such practices in increasing trust between data holders and those affected by data, they tended to approach engagement in more empowering ways than examples from the previous section. For example, Digital Democracy and Data Commons, the first DECODE (Decentralised Citizen Owned Data Ecosystem) pilot in Barcelona, had a six-month participatory process to test a new technology for participatory democracy and to open a debate into data politics.^[168] And Wellcome’s framework for governing a mental health data bank also suggested engagements with lived-experience advisers and local researchers to mitigate risks around data sharing and access controls.^[169]

Collaborating

Representation in governance structures

Literature in the landscape map referenced the need for citizen representation in data governance structures,^[170] ^[171] ^[172] ^[173] ^[174] embedding ways for members of the public to contribute to the discussion of their data stewardship,^[175] ^[176] and the unique role of data intermediaries such as data trusts in facilitating collaboration with the public.^[177]

A 2023 analysis of data sharing practices in the City of Hamburg^[178] represented some of the mechanisms described above in detail. It developed the idea that a data intermediary – that is, an entity processing data on behalf of another organisation – is necessary for ensuring data related to cities is used in service of public interest. It also argued that intermediaries must ensure that decision-making around data is participatory, and advocate for community representation in governance structures.

Large-scale databases had more formalised structures for participation and collaboration. These were often panels or steering groups comprising individuals representing the beneficiaries of the database, which oversaw how data was being used and shared, as was the case for the UK Cystic Fibrosis Registry and Public Health Scotland’s Electronic Data Research and Innovation Services, among others,^[179] ^[180] ^[181] ^[182] ^[183] ^[184] or had involvement in wider governance structures that informed decision-making within the organisation itself, as with MIDATA Cooperative and Digital Earth Africa, among others.^[185] ^[186] ^[187] ^[188]

Static versus dynamic consent

On-the-ground projects in this category were variable in terms of the level of power they afforded those affected by data decisions or those donating their data, particularly in relation to consent. For example, the Finnish Social and Health Data Permit Authority (Findata) grants permits for the secondary use of social and health data, meaning that data from social and health care services is used for purposes other than the primary reason for which it was originally collected. It compiles and preprocesses the pseudonymised and anonymised statistical data to ensure privacy protections and anonymity. By contrast, models allowing for dynamic consent included private personal data exchanges like Datafund.^[189]

Empowering

Bottom-up governance

Entries coded in this category mentioned participatory, bottom-up governance structures such as data cooperatives, which provide space for collective decision-making while preserving individual agency over personal data.^[190] ^[191] ^[192] ^[193] Examples of data cooperatives in practice included JoinData,^[194] an online platform for farmers to share agricultural data that gives members voting powers in general assemblies, and Driver’s Seat Cooperative,^[195] a ridesharing platform owned by its members who are then able to serve on its governance board and share in profits.

Outside of data cooperatives, bottom-up and community-driven approaches to data governance were positioned as ways of redistributing power over data. In the migration and refugee space, new approaches to data stewardship that bring migrant and refugee communities together to deliberate on data-related decisions and negotiate with decision-makers were proposed as a way to ensure data systems work for their benefit.^[196]

4. Mechanisms to support inclusive data stewardship were underdeveloped in the landscape review

The final section of our desk-based findings looks at the dataset through the lens of inclusive data stewardship. Here, inclusive data stewardship refers to a practice that considers all parts of society, particularly those likely to be most impacted by data systems, in the responsible collection, maintenance, sharing and use of data.

The Aapti Institute’s Fostering Participatory Data Stewardship playbook^[197] identifies inclusion as a key challenge sectors can face when implementing participatory practices in data governance. The playbook highlights that when structural barriers are not accounted for (such as class, gender, ethnicity/race, age, citizenship status and vulnerabilities or capacities such as technical, financial and data literacy), participation can be surface-level or lack diversity. To mitigate this, the playbook highlights the value of forging community-level partnerships that empower engaged community members to build bottom-up, data-oriented communities, as well as the value of facilitating more diverse onboarding for stewarding organisations.

Ada’s 2021 Participatory data stewardship framework^[198] similarly draws out complexities in relation to inclusion and the participatory mechanisms it describes. It highlights that:

informative transparency measures can potentially fall short of enabling non-specialists to engage
consultation can be designed in a manner that excludes marginalised perspectives
the requirements of meaningful involvement may impose structural constraints on those able to participate
collaborative data donation systems can result in the underrepresentation of marginalised groups and communities
and data cooperatives, though empowering, can expect or engender high levels of active participation that risks excluding those who may wish to actively participate but find the costs onerous.

To examine our dataset from the perspective of inclusive data stewardship, we reviewed a random subset of 50 sources from the dataset, ensuring half were examples of implemented projects and the other half was theoretical literature.

Overall, we found that the dataset was more developed in terms of references to participatory data stewardship practices than in its reference to methods for promoting inclusive data stewardship. The complexities drawn out in Ada’s 2021 report still resonate with the current landscape and our analysis suggests that the relationship between participation and inclusion is not always linear.

Inclusive data stewardship mechanisms

Looking at a random subset of 50 sources from our dataset (25 implemented projects and 25 theoretical sources), we found that while participatory mechanisms promote inclusion, these mechanisms are not always sufficient to enable inclusive participation. For instance, some voluntary data-sharing initiatives have mechanisms for participant feedback, but it is unclear how accessible these mechanisms are, and there may be a tendency for such practices to favour those with additional time and digital resources to engage (for example, BitsAboutMe).^[199] And while citizen science initiatives can challenge dominant cultures of knowledge production, more citizen representation and participation in data initiatives does not necessarily lead to equitable benefit sharing from data processes and outcomes for citizens.^[200]

One source, which explores how local governments could address power imbalances in the data ecosystem, highlighted the role of a ‘trusted intermediary’.^[201] Because people are often not able to or interested in participating directly in data governance, intermediaries can play an important role in representing the interests of a diverse public. The source specifically pointed to the role of public engagement activities in creating trust as an intermediary, further demonstrating the interlinked, but not always linear, relationship between participation and inclusion.

As we interrogated the sources through the lens of inclusion, we found that we were often coming back to ideas of power and structural inequalities. It was in this respect that movements around Indigenous data sovereignty felt like strong examples of inclusive data stewardship. Across the wider dataset there were references to Indigenous data sovereignty and data ownership debates that acknowledged the value of incorporating multiple knowledges when designing governance systems.

Specifically, within this random subset of data, we found guidance for researchers to meet Indigenous data sovereignty standards by reflecting on the ideas, values and behaviours they, as researchers, bring to research.^[202] The guidance also included examples of governance mechanisms that enabled tribal and Indigenous perspectives to feed into research that will impact them directly or indirectly – for example, in health data collection initiatives such as the All of Us programme^[203] and others.^[204]

Findings from interviews

We entered into interviews having completed the desk-based data review. Interviewees were presented with some headline key findings of the data review to respond to, as well as open questions to draw out their reflections of the participatory and inclusive data stewardship landscape.

1. Definitions and terms used in this field vary

Interviewees described many different definitions of data stewardship. Some interviewees defined the term as a professional role/function within an organisation while others referenced stewardship as an organisational responsibility for institutions actively holding data.

The differing models of data stewardship highlighted across the interviews are in line with the various definitions we found in our 2021 report, Exploring legal mechanisms for data stewardship,^[205] suggesting the landscape has not converged in the last three years. This diversity in definitional understanding of data stewardship is likely to contribute to the complexity of the landscape as a whole.

‘Stewardship brings to mind an ongoing relationship. So not a single point in time, but some sort of relationship between the steward and the dataset and also the people who represent the dataset’.
– Jennifer Ding, Senior Researcher, The Alan Turing Institute

We also asked interviewees about their perspectives on data governance and mechanisms for stewarding data. Governance was seen as a component of data stewardship that supports decision-making around data, while participation was seen to have a more relational role.

‘When it comes to the governance aspect of participatory governance, I see the steward as helping with providing the social, digital and maybe even legal infrastructure. And then on the participatory side, I think part of the role of a steward and stewardship is to create more on-ramps and opportunities for people to get involved.’
– Jennifer Ding, Senior Researcher, The Alan Turing Institute

Many interviewees commented that, in practice, governance mechanisms tend to overlap. For example, data trusts and data cooperatives can operate in similar ways, using participatory mechanisms for decision-making. Some felt that distinguishing between governance mechanisms with similar models or outcomes is therefore less important now than previously. The legal underpinnings of these mechanisms, however, remain important in determining legal responsibility and accountability.

‘The difference [between data institutions] is really a legal difference in the kind of safeguards that are in place were things to go wrong.’
– Professor Sylvie Delacroix, Inaugural Jeff Price Chair in Digital Law, King’s College London

2. Purpose matters to people’s comfort with different uses

Some interviewees talked about the significance of purpose in defining participatory and inclusive data stewardship mechanisms. There was an acknowledgement that different purposes require contributions from different people and communities.

‘How we define [data stewardship] is very different according to who you ask, and who is involved, and what the purpose is, and what the goals are, and what the institutions are, and so on.’
– Arne Hintz, Reader, Data Justice Lab, Cardiff University

A unifying theme across definitions was the motivation of public and/or community benefit, and ensuring data supported this benefit from the point of collection through to use and reuse.

‘There’s interest from different stakeholder and Indigenous communities that data is used for things that add value and don’t detract or have negative outcomes for them.’
– Maui Hudson, Director, Te Kotahi Research Institute

Interviewees also highlighted that the question of use and reuse in relation to purpose is complicated because, while consent is an adequate control for the purpose data is originally collected for, when data is reused it is often not possible to identify people, or to determine their comfort or discomfort with different uses or purposes.

‘Data stewardship deals with this kind of agency asymmetry, where you can establish new ways of engaging communities and individuals to be able to determine what are the kinds of purposes or uses that they would feel comfortable with, or disapprove of. Obtaining a “social licence” for reuse is a fundamental part of stewardship, and this can only be done by rethinking how you engage with individuals and communities, and that’s where participation comes in.’
– Stefaan Verhulst, Co-Founder, GovLab and the Data Tank

3. Participation takes many shapes and forms in the data lifecycle

Participation was described as a fluid rather than static concept. We explored with interviewees their experiences with, awareness of, and views on participatory mechanisms to support data stewardship. Some interviewees said that it was important to distribute participatory mechanisms across what they described as the ‘data lifecycle’, and suggested ways of doing this, from participatory agenda-setting prior to the collection of data through to participatory governance mechanisms at the point of storage, use and reuse of data. Importance was placed on having a range of options for participation across the data lifecycle that had a meaningful impact on stewardship decisions.

‘[There is a] need for multiple different modes of engagement and involvement at multiple different points in the life cycle [and a need for] public participation to be distributed throughout the entire system […] sometimes it’s people on boards, sometimes it’s how to commission particular one-off kinds of exercises, sometimes it’s having specific panels, sometimes it’s doing a survey. […].’
– Jeni Tennison, Executive Director, Connected by Data

However, there were concerns around ‘rubber stamp’ participation, which does not meaningfully afford decision-making power to participants. Interviewees felt it was important to unpack both where participation is happening in the data lifecycle and the level of power afforded. The effect of participatory practices was also a concern – for example, the power dynamics between those facilitating a participatory activity and those taking part. Inequalities in this power dynamic, if left unchecked or unacknowledged, can reduce participants’ agency.

‘In these initiatives we’ve seen an unequal balance of power between those that are assembling initiatives and others who participate […] good participation allows participants to change the goal posts of debate.’
– Arne Hintz, Reader, Data Justice Lab, Cardiff University

Bottom-up initiatives in particular were seen to offer opportunities to address asymmetries in power. Here, bottom-up refers to mobilisation by affected or interested people around common concerns or issues – often, but not always, at the grassroots level. Often issues-based (for example, a group of people gathering around a shared data issue/topic) or identity-based (for example, regional movements), these initiatives tend to consider ideas around collective benefit rather than individual data rights.

Aspirations for the balance between grassroots and institutional activity, and conditions for the latter, were underexplored in our interviews. It was suggested that top-down initiatives do have a place in the landscape, particularly when data is related to sensitive topics (for example, mental health) or potentially vulnerable populations (for example, refugees). Here, it may not be reasonable to expect on-the-ground participation or mobilisation to provide adequate safeguarding, and instead a trusted intermediary may be well placed to represent the needs and expectations of these groups.

‘So one of the other things that we’ve been looking at is the bottom-up demands for power within [data] systems and how we can facilitate bottom-up demands, and often that comes from movement building, and a building of collective power within those communities who can then demand a say rather than being beneficently granted a say.’
– Tim Davies, Director of Research and Practice, Connected by Data

‘Bottom-up movements cut out white tape [a term used intentionally in place of ‘red tape’] in government in that they generate high quality data with impacts that are actually meaningful for the communities they serve.’
– Gwen Phillips, Ktunaxa Nation

4. Mechanisms for inclusion are less developed than mechanisms around participation

While there was a recognition that inclusion was an important aspect of participation, a number of challenges around implementing inclusive data stewardship were raised. These included challenges around who should be included, how to incentivise them to participate and what inclusive participation looks like.

Those most affected by a data system were often cited as the groups to involve in participatory practices. Some interviewees noted that these groups of people can often be underrepresented in participatory practices, particularly when methods strive for a representative split of different groups rather than an in-depth focus on specific groups of people. As one participant expressed, there can be ‘majority–minority tensions’ when participatory practices, and their subsequent outcomes, focus only on the impacts on underrepresented populations, as stakeholders in power can be dismissive of the legitimacy of this evidence.

Data literacy, community capacity and lack of incentives to participate were all raised as challenges in inclusive practice. Some interviewees highlighted that a lack of data literacy, coupled with limitations on community capacity to engage and/or poor incentives, can mean that the voices of those who may already be most underrepresented in the discourse are continuously missing from participatory practices. The challenge is heightened by the fact that data rights often take a lower priority than other issues in many people’s day-to-day lives.

‘Inclusive entails that we are looking out for the people who are currently arguably excluded. My first thought would be the elephant in the room in this conversation is most people still don’t really have any awareness of the fact that they leak data on a daily basis, that they have rights, that they can leverage those rights to achieve a number of things.’
– Professor Sylvie Delacroix, Inaugural Jeff Price Chair in Digital Law, King’s College London

Communities also need to have capacity and incentive to engage for data stewardship to be inclusive. Interviewees reflected that self-elected involvement in participatory activities has its limitations. Those most empowered to participate will likely be overrepresented, while those facing barriers in terms of their own capacity and the associated incentives will be underrepresented. It was noted that distribution of economic benefits of participation and data stewardship should not be left out of the discourse as this is a driving force for better inclusion in practice.

‘We did some work on data trusts with the aim of articulating what they might look like in three different examples, and one of them was in small shareholder farming in India. We spoke to some farmers and they [asked us] “why would I do this if it’s not directly going to give me access to credit or increase my yields?”’
– Joe Massey, Senior Researcher, Open Data Institute

5. Trusted intermediaries are a critical component of the ecosystem

Trusted intermediaries were often suggested as bridges in the participation-inclusion gap identified in the landscape mapping. Trusted intermediaries could include community leaders or advocates acting on behalf of a community, civil society organisations representing community interests, or other legal mechanisms within a data institution such as a data trust that lend power to a body representing the views of a wider community.

There was a suggestion across some interviews that inclusion does not mean that everybody has to be included in everything, which is where trusted intermediaries fit into the landscape. This view was held when thinking through issues around capacity to engage in data stewardship practices, as well as in instances where participation can exacerbate existing vulnerabilities, as previously mentioned in this review.

‘There’s almost this implicit assumption that people know about the value of data [but] most people don’t […] so I think in most cases, you definitely need an intermediary who’s doing all of this work.’
– Vinay Narayan, Senior Manager, Aapti Institute

The role of trusted intermediaries is linked to additional conversations on participatory and inclusive data stewardship processes versus outcomes. Some interviewees suggested that processes can be participatory and/or inclusive without leading to more favourable outcomes for affected groups of society, while others can lead to favourable outcomes without being participatory and/or inclusive in process. When the latter is true, trusted intermediaries could act to facilitate inclusive outcomes, despite the lack of participation or inclusion of representatives from affected communities in formal processes.

‘I think with both participation and inclusion there are processes and there’s outcome, particularly on inclusion. So we can have an inclusive process with affected stakeholders being part of the decision-making, but there is also the question of whether we have inclusive outcomes where the distribution of the value, benefits or potential of data are more substantively inclusive, and not violating equality concerns. I think in the case of process, participatory and inclusive should be aligned. But it is also possible from an inclusion point of view to have a good process but a poor outcome, or poor process and good outcomes: those two things can be separate.’
– Tim Davies, Director of Research and Practice, Connected by Data

However, in our interviews, we did not fully explore what makes a trusted intermediary. We can infer that legal responsibility may be one marker, particularly when concerning legal mechanisms such as data trusts, but it is likely that other metrics are also important – for example, transparency in practices, the sharing of values, or a record of facilitating positive changes for the communities the trusted intermediary seeks to represent. Unpacking what makes a trusted intermediary may help further understand how to support inclusive data stewardship.

‘These are intermediaries who are trusted by the community for various reasons, because of the role they play for that community.’
– Vinay Narayan, Senior Manager, Aapti Institute

6. Progress has been made in some areas of data stewardship

EU legislation such as the GDPR was cited as a milestone of progress in the data stewardship landscape. However, there was also a view that legislative progress in this space is limited.

‘GDPR primarily supports the ability for your data not to be processed. But I think what we’re finding is when it comes to governance […] there’s so much more that [people] would actually be interested in.’
– Jennifer Ding, Senior Researcher, The Alan Turing Institute

‘[I think] we actually have missed a few opportunities to advance data stewardship. One missed opportunity for instance has been the European Data Act. One of the early developments of the Act was a requirement or expectation that if you are a large organisation that has data that could be beneficial to society that you should have some kind of achieved data steward [role or function in the organisation]. That got dropped by the European Commission.’
– Stefaan Verhulst, Co-Founder, GovLab and the Data Tank

Some interviewees felt that the Indigenous data sovereignty movement (described as the recognition of Indigenous peoples as rights holders in relation to Indigenous data) was an area of growing development.

Gwen Phillips, who is from the Ktunaxa Nation in Canada, felt that some progress has been made since 2013 – a point in time when First Nation communities were often ‘mis-named, misidentified’ in policy. In 2021 the First Nations Information Governance Centre was funded to work on developing a national data strategy, with an ambition to set up networks of data centres with participatory action movements and ground-up data collection that strives for ‘distinct reportability’ – data that is related to the distinct needs of distinct First Nations against self-determined definitions of wellbeing. The Centre’s vision is that each distinct First Nation achieves data sovereignty in alignment with their own worldview.

More generally, interviewees felt that principles such as the CARE principles (Collective benefit, Authority to control, Responsibility, and Ethics) have helped push the dial forward in this discourse. However, adoption of these principles is still in its early stages.

‘It’s early days, but there is a general acceptance that the CARE principles are part of a suite of principles that should be thought about and used. What it looks like in practice is still evolving.’
– Maui Hudson, Director, Te Kotahi Research Institute

Growing excitement around participatory methods, as well as a more developed discourse around data stewardship, was suggested as another area of progress in the landscape. For example, participants described how participatory practices tend to be encouraged in research applications, with potentially more funding opportunities available for this type of work.

‘I think it was super interesting to see the announcement of the Data Empowerment Fund and the massive reaction it got. It shows that there is actually an underlying awareness and a desire to experiment with new ways of participation called stewardship in the area of data.’
– Stefaan Verhulst, Co-Founder, GovLab and the Data Tank

One interviewee felt that the case for participatory and inclusive data governance in the UK has been well established, and resources are now needed to support institutions in embedding these practices. But optimism around the recognition of participatory methods was tempered by concerns about ‘participation washing’ and about engagement with these debates being based on optics rather than values or missions.

‘Within the UK, I think there’s much more buy-in and interest and investment in these kinds of participatory governance activities both in the public sector and private sector.’
– Jennifer Ding, Senior Researcher, The Alan Turing Institute

‘One of the things that concerns me is that the more that this [participatory stewardship] is done badly, [the more] participation looks like participation washing. It doesn’t actually make any change in the world. The more that it is done badly the more feelings of exploitation build up and the harder it is to do well in the future.’
– Jeni Tennison, Executive Director, Connected by Data

7. But barriers have slowed down progress in some areas of the landscape

Looking across the landscape, several interviewees felt that sustainability was a key issue. Well-cited examples like Driver’s Seat (as described in the above review) have ‘closed shop’, with interviewees suggesting a range of potential barriers for sustained activity.

Funding challenges: Many stewardship efforts do not generate monetary value and therefore rely on external funding and grants. Within this, there is the added challenge of documenting impact in a short period of time to justify additional funding, particularly as the field is emerging and impact is not always immediate. Some interviewees mentioned public–private partnerships as a route to generate income, but there was a sense that this is not common practice.

‘[There is a] deep economic challenge of data stewardship, which is that sustaining this work through business models often struggles to steward data well, particularly in inclusive and participatory ways.’
– Tim Davies, Director of Research and Practice, Connected by Data

‘[As we’ve heard from another organisation] the cost of a technological product alongside the stewarding data part [of an organisation] is actually really capital intensive and expensive.’
– Joe Massey, Senior Researcher, the Open Data Institute

Capacity to participate: As described earlier in this review, capacity to engage in initiatives was cited as a key barrier in the landscape that may contribute to the issue of sustainability. Capacity includes having time to participate, enough data literacy to be involved meaningfully and access to appropriate resources and support to sustain involvement.

‘Maybe participation requires a certain amount of free time or a background in the topics at hand, but that automatically will just cut down on who can get involved. So I think it’s really important to consider how to lower the barriers for people to get involved and I think that comes through good activity design, prior relationships and engagement with the community.’
– Jennifer Ding, Senior Researcher, The Alan Turing Institute

Incentive challenge: Related to the barrier of capacity is the issue of incentives to participate. It was felt by most interviewees that incentives to participate in participatory data initiatives are not always clear to those involved, nor do they always resonate with communities.

‘I don’t know how many people are knocking on the doors of these places [data repositories] and saying “we need you to have data governance in place”. The governance of the data is an enabler. And the thing that motivates most people is the actioning [using data to do something]. It’s not just a question of expertise. It’s also a question of motivation.’
– Maui Hudson, Director, Te Kotahi Research Institute

In addition to sustainability, interviewees felt a barrier in the landscape was a lack of institutionalisation of participation and inclusion in governance practices. Interviewees expressed frustration that participatory and inclusive stewardship remains an optional practice, as this made it more difficult to build the case for embedding it. Some sectors were cited as being more developed in their institutionalisation of participation (for example, the health sector) while in others this was seen as more of an emergent field. The lack of institutionalisation may also have implications for trust that feed into the other barriers identified; if participation and inclusion do not lead to meaningful change, people may feel disincentivised to participate.

‘So right now participatory stewardship is an option that is granted by some but not required. So how do we embed this? It’s very hard to do that within our common dominant narratives of rapid innovation.’
– Tim Davies, Director of Research and Practice, Connected by Data

‘The health sector is really leading the way on that, given that PPIE [patient and public involvement and engagement] is a recognised and fully funded activity and profession, more so than in other branches of research where that professionalisation is still emerging.’
– Jennifer Ding, Senior Researcher, The Alan Turing Institute

8. Discourses are shifting from data to AI

Finally, when exploring where the landscape is now and where it may be heading, some participants expressed concern that heightened interest in AI is shifting attention away from data. There were suggestions that in the last few years attempts to create stewardship spaces have struggled due to the hype of AI, raising questions around whether the move towards AI will progress the data stewardship debate or eclipse it.

There was a suggestion that narratives around data stewardship may have been disrupted by emerging AI technologies. While the landscape had previously established the importance of stewarding data, emerging technologies like foundation models bring new challenges.

‘We are actually entering a data winter while we are having an AI summer.’
– Stefaan Verhulst, Co-Founder, GovLab and the Data Tank

However, data and AI remain intrinsically linked. One interviewee reiterated the value of established debates within the data stewardship landscape when thinking about the complexities of emerging technologies such as large language models. Using principles such as CARE, stewardship of data can still be explored in the context of AI technologies that are of general purpose, rather than designed specifically for one use or application, as demonstrated below:

‘I think interest from different stakeholders and Indigenous communities is that the data is used for things that add value or don’t have negative outcomes for them. And that they can access and use it for their own purposes so there is a future benefit aspect. When data gets sucked up into GenAI models and it removes all the provenance data, which of those possibilities is the AI model removing? Is it still maintaining the quality of the data, is it still maintaining access to the data, is it still maintaining opportunities for benefit?’
– Maui Hudson, Director, Te Kotahi Research Institute

Conclusion

This landscape review aimed to bring together disparate understandings of participatory and inclusive data stewardship in theory and in practice, through an international lens.

It reveals a participatory data community that gives a central role to the voices of people and communities in shaping data practices through ‘bottom-up’ initiatives, and is heavily invested – in theory and in practice – in understanding people’s interests, increasing individual empowerment and benefiting the public. From their perspective, the legal mechanisms that support rights and protections are a foundation for enabling conditions to achieve different degrees of empowerment.

In parallel with this is an emerging collection of public-sector or private–public partnership organisations that have a different interpretation of stewardship, and aim to maximise the economic and societal value of data through top-down initiatives. They use the language of stewardship to indicate broad intentions towards good data practices (often without a model of data stewardship that encompasses responsibility to others’ interests). From their perspective, participatory practices contribute to building better data practices and ensuring public benefit.

The review makes apparent that a holistic view of participatory and inclusive data stewardship mechanisms and practices is needed, extending beyond data governance to include all the practical (sociotechnical) infrastructure around the data, including organisational functions or community behaviours. In addition, bridging between the various knowledges and practices (for example, participation and inclusion; legal and participatory mechanisms and practices; and top-down and bottom-up purposes) will be necessary to realise the potential of these mechanisms to contribute to a healthy data ecosystem.

We do not suggest that participatory and inclusive data stewardship alone can remove the inequities in the current data economy, but we do propose that it has the potential to contribute towards positive change, for example disrupting private-sector monopolisation, rebalancing power towards data subjects and those affected by data sharing and use, and supporting collective or public-interest outcomes.

In particular, the review highlights that:

There is no single model for ‘good’ participatory and inclusive data stewardship, but a baseline would include: consideration and use of legal mechanisms to ensure rights and protections, as well as participatory mechanisms appropriate to the purpose and context, with responsibility and accountability to the requirements of data subjects, holders, beneficiaries and wider publics or affected people.

Conversely ‘bad’ data stewardship (or ‘data stewardship washing’) would be characterised by: a lack of assumed responsibility or accountability in relation to the requirements of data subjects, holders, beneficiaries and wider publics or affected people. This would risk undermining the legitimacy and trustworthiness of the initiative, and broader public trust in data sharing.

Next steps

More work is needed to identify and assess existing examples of practice, and for powerholders in policy and industry to enable conditions in which the value of participatory and inclusive data stewardship can be meaningfully assessed. This includes bottom-up and top-down initiatives located in a range of sectors, geographies and demographics; knowledge exchange; and opportunities for experimentation and learning. In particular:

more investment is needed to bridge from theoretical work on mechanisms and community initiatives into practice at different scales, including pilot projects of smaller- and larger-scale participatory mechanisms for data stewardship
more time is needed to develop evidence about the effectiveness of participatory and inclusive mechanisms, and whether they would deliver value if adopted and integrated systematically into procedural data infrastructure and decision-making
to be effective, investment in research and practice should extend beyond participatory and inclusive data governance, to the practical (sociotechnical) infrastructure that needs to be in place around the data, including – for example – finances or community behaviours.

What can we say about participatory and inclusive data stewardship?

As this research demonstrates, the term ‘data stewardship’ is widely used, but with a variety of definitions, and understandings of mechanisms and purposes. Mechanisms encompass a variety of relationships (some underpinned by legal protections) in which data 1) originates with or is considered to be ‘owned’ by an individual or group, and 2) is governed responsibly – and in the interests of those people – by an individual or organisation. Purposes can include realising individual objectives of participants, establishing or profiting from commercial relationships, and supporting collective or public-interest outcomes.

When defining participatory data stewardship, the Ada Lovelace Institute foregrounded the conditions ‘informed by values and engaging with questions of fairness’.^[206] We recognise now that the typology we developed in 2021 has some limitations, and this definition does not go far enough in exploring the complexity of empowerment relationships, or clarifying concepts and roles.

Arnstein herself commented that her ladder of participation does not include an analysis of structural inequalities that affect participation,^[207] such as racism and resistance to power redistribution. Similarly, our 2021 framework does not explicitly call attention to ways of stewarding data in a participatory and inclusive way, or detailed mechanisms to recognise and redistribute entrenched power in established data systems. This review presents an opportunity to revisit our framework and propose alternative measures.

Participatory data stewardship focuses on the meaningful involvement and empowerment of people. In participatory data stewardship theory, some scholars favour ‘bottom-up’ data stewardship as a model that centres protecting the rights and interests of data subjects, where people define their own conditions for data sharing. Other organisations are exploring top-down data stewardship with the purpose of maximising the value of personal and non-personal data, predominantly with the interests of the scientific community and public in mind.^[208] Both these approaches have different implications for participation and inclusion.

However, our research demonstrates that thinking about inclusive data stewardship, and what that may look like in practice, is even less cohesively established. This may be, in part, because engagement with questions of fairness during meaningful participation requires engagement with principles and questions of inclusion. But inclusion is distinct and requires consideration not just in participatory terms (that people are involved in decision-making), but – with attention to the context and specificity of the situation – which people are involved, and how.

These findings indicate that inclusive data stewardship primarily involves attention to who is able to meaningfully participate in the stewardship of their data, with a secondary focus on how this participation brings personal and community benefit. Inclusive data stewardship can enable multiple uses of data by different users with a range of purposes, which can generate different kinds of economic and social value. This could contribute to redistributing the benefits generated from data and breaking down data silos.^[209] If we also see inclusion as a set of practices that bring not only a broad range of people, but also issues, sectors, perspectives and engagement, into a participatory context,^[210] then we may also see complementary roles for inclusion and participation.

The evaluative sources identified in this research highlight that inclusive practices require sensitivity to context. For example, the Open Data Institute and Aapti Institute discuss how, in relation to co-designing data trusts for climate action,^[211] it may be more appropriate for civil society organisations representing the communities they serve to set up stewarding initiatives than the community members themselves. They argue that bottom-up institutions presuppose a level of digital literacy and capacity to engage on data rights (that is, these institutions may exclude those without strong digital literacies or personal capacities to engage on data matters) and so it is not always appropriate to expect community members to do this themselves.

This insight draws out a slight tension in our participatory data stewardship framework. Our framework considers direct participation and control from community members as the highest level of empowerment. As such, a completely bottom-up organisation would be considered more empowering and participatory than one that is set up and governed by institutions representing community interests. However, the reverse may be true when considering inclusive data stewardship, acknowledging that some people may legitimately lack the interest or capacity to participate directly in data governance.

In addition, context is important. An analysis of applying data trusts in an African context draws attention to the limitations of expecting governance structures to be transferable from one context to another.^[212] Structural inequalities and power asymmetries such as those resulting from the legacies of colonialism complicate the extent to which data initiatives can be considered inclusive across all contexts and for all people.^[213] Moreover, the knowledges these initiatives draw from are important to consider. In relation to ideas around communal data ownership, researchers argue that a core issue in the discourse is a narrow definition of property rights that draws largely from Western definitions of ownership, which encounters tension and contention in non-Western settings.^[214]

Overall, the findings suggest that the participatory and inclusive data stewardship landscape is currently underdeveloped, and the promise of specific data institutions such as data trusts and data cooperatives to deliver benefits for individuals and the wider ecosystem has not yet materialised. Interviewees believe that there are some areas of progress, and indications that a rich and diverse stewardship landscape that facilitates bottom-up and top-down initiatives, knowledge exchange and opportunities for experimentation and learning would contribute to building a healthy data ecosystem.

Observations about the landscape

Exploring the current landscape proved to be a complex task. It involved examining the levels and areas in which data stewardship can or should operate, including: enabling bottom-up empowerment of people and communities in the creation, access, use and sharing of data; or involving people in top-down governance of data through public-sector organisations. The methods we used have surfaced some insights, but further research is needed to understand fully the aspects we identify.

Importantly, we identified that engaging with ideas about data stewardship moves the focus from who owns the data to who takes responsibility for its use, and on whose behalf. This lens enables us to identify the purposes of data stewardship – from disrupting private-sector monopolisation to rebalancing power towards data subjects – and how those purposes have been operationalised by different actors through the use of different mechanisms.

In reporting from the participation community there was an assumption that data stewardship is a useful and necessary component of the data ecosystem, and that participatory and inclusive practices would be beneficial. Bottom-up initiatives were seen to offer opportunities to address asymmetries in power, while institutional stewards or trusted intermediaries were seen to have a safeguarding role where data is related to sensitive topics (for example, mental health) or potentially vulnerable populations (for example, refugees).

There is optimism that emerging mechanisms and pilots will demonstrate value, alongside expressions of frustration where momentum has not been sustained or other barriers have impeded progress. Respondents identified funding, incentives and constraints on people’s capacity to participate as important challenges for embedding participatory and inclusive stewardship practices.

The deep expertise and knowledge in participatory mechanisms and practices risk remaining siloed in bottom-up initiatives as the political will to mobilise value from data accelerates large-scale data-sharing initiatives. This brings with it risks of ‘participation washing’, where people are involved in consultations but – due to inadequate use of mechanisms, unintentionally or by design – their views do not have an effect on outcomes. In addition, if initiatives using smart or non-personal data shift definitions of stewardship away from its traditional meaning – implying a responsibility towards others – and towards a notion of good data practices that is not rooted in legitimacy by representing the views or wishes of data subjects or those affected by data use, there is a risk of a new phenomenon: ‘data stewardship washing’.

Our research found more literature around participation than inclusion, and interviewees also suggested developments in understanding and mechanisms for supporting participation in data stewardship are further ahead than mechanisms for inclusion. The latter was seen as a more complex challenge, with recognition of potential tensions between the two concepts even though the terms were sometimes discussed interchangeably. In relation to inclusion, trusted intermediaries were often explored as important stakeholders in the landscape.

AI in the discourse around data stewardship emerged as a new trend, and – while it is still unclear how this might change the shape and form of the landscape – if it does fulfil even some of its promise to deliver economies of resource and growth in public and private sectors, managing the data on which systems are trained and operate will be critical. While some interviewees expressed pessimism that AI would eclipse important discussions around data, others felt it was appropriate to use concepts already well-established in the stewardship space around data rights and use and translate these over to emerging technologies in an additive way.

Overall, our review found that there are existing legal mechanisms and some specific participatory and inclusive practices that support and enable data stewardship mechanisms. It may not be necessary to produce specific legislation: the research in ‘Data stewardship in practice’ indicates that there are a large number of legally underpinned inclusive and participatory pilots – including novel approaches. But it seems likely that, without systemic incentives to shift existing power structures and enable space for a new ecosystem to develop, these initiatives will not have the conditions to flourish long-term.

In a mature ecosystem, these underpinnings would be bridged by communicable and transferable norms and social rules that support pilots, innovative models and sharing of knowledge. In the short time since data stewardship mechanisms have been proposed, these norms have not yet had time to develop. More time and investment are needed before it is possible to evaluate whether data stewardship can make a meaningful contribution to a fairer, more equitable data ecosystem.

It is too early to judge whether participatory and inclusive data stewardship mechanisms can produce positive outcomes for data subjects, holders, users and the wider affected population. What is clear is that the current direction of travel in the data ecosystem prioritises economic and commercial benefits, growth, and potential public benefit over public involvement and empowerment in decision-making.

Interviewees made many suggestions in relation to their visions for the future participatory and inclusive data stewardship landscape, and some have surfaced through the research period. In particular, we identify the following areas for further research:

deep research into practical examples of participatory and inclusive data stewardship, to provide an international, on-the-ground view of the opportunities and challenges in the current ecosystem
greater understanding of structural and systemic considerations, including incentives for participation beyond monetary compensation, and barriers to participation and how they might be overcome
greater understanding of the incentives and challenges faced by policymakers and regulators in different jurisdictions, and their views on the importance of distributing power and value accrued from data (and rebalancing power towards data subjects and those affected by its use)
more research into private companies’ practices in relation to data collection, sharing and use, as well as their views on potential barriers to and incentives for participatory mechanisms and approaches
building an environment of knowledge sharing that has accessible tools and resources for initiatives of all sizes (from grassroots organisations to large-scale data-sharing initiatives) to draw on and learn from, to support the creation of communicable and transferable norms
further experimentation and innovation in participatory methods, including support for trials and pilot projects of both bottom-up and top-down initiatives, and documentation for shared learning
more research building on this foundational review, for further investigation into the systematic or institutionalised involvement of people in decision-making about data.

Methodology

This review focuses on social and process-related organisation rather than the technical considerations relating to data stewardship and is inherently sociotechnical in its approach. It understands mechanisms for data stewardship as operating within a wider ecosystem of societal, economic and relational dynamics that extends beyond government and regulators to include public and private-sector organisations, civil society and publics.

It begins with three exploratory questions about the current state of the participatory data stewardship ecosystem, as a way of understanding existing models rather than theoretical ideals:

What are the conditions across the ecosystem of participatory and inclusive data stewardship?
Where are there distinctions across organisations working on participatory and inclusive data stewardship?
How effective are various mechanisms for participatory and inclusive data stewardship?

This review focuses specifically on the first two research questions of the overall programme. It has deployed a multi-methods approach to both envision and look across the ecosystem of participatory data stewardship, and to drill down to identify its particularities. The final question was considered out of scope for this review. To answer the question of effectiveness in this field, an analysis of evaluative materials, specific case studies and more detailed interview methods would be needed.

Legal and participatory analysis

The work on legal and participatory mechanisms emerged from understanding the need to locate evidence about current practices, identified through desk-based research and interviews, in an analysis of preconditions or underlying factors in the wider data and AI ecosystem.

The legal analysis relies on the interpretation of primary sources such as legislation, and authoritative secondary sources such as official institutional releases about issued legislation. This contribution synthesises and applies domain expertise to provide an informed commentary and juxtaposes theoretical frameworks on participation with relevant legal provisions.

The participatory analysis was informed by a desk-based review and a legal analysis. In relation to expertise on participatory and inclusive data stewardship in practice-based community and civil society organisations and academia, it reviewed literature identified through the desk-based review of frameworks and evaluations, participation and inclusion. In relation to the legal analysis, it reviewed literature including recent developments in top-down data stewardship, including large-scale data-sharing initiatives.

Desk-based database review

We used specific keywords across Google Scholar, arXiv, Google Browser as well as internal networks/snowballing from sources to identify organisations, institutions and communities currently interested in participatory and/or inclusive data stewardship. Results were limited to pieces of work from 2018 onwards, or initiatives that predate 2018 but were still active from 2018 onwards. The desk-based data collection took place between December 2023 and April 2024.

After internal review and consultation, the following keywords were used across the databases searched:

“Data stewardship” AND (“community” OR “people” OR “publics” OR “participatory” OR “inclusive”)
“Data trust” AND (“community” OR “people” OR “publics” OR “participatory” OR “inclusive”)
“Data cooperatives” AND (“community” OR “people” OR “publics” OR “participatory” OR “inclusive”)
“Data intermediaries” AND (“community” OR “people” OR “publics” OR “participatory” OR “inclusive”)
“Data governance” AND (“community” OR “people” OR “publics” OR “participatory” OR “inclusive”)
“Data commons” AND (“community” OR “people” OR “publics” OR “participatory” OR “inclusive”)
Community based data sharing

The first 10 pages of results from each database were reviewed in relation to each keyword. Results were excluded from the final dataset for analysis if they did not explicitly reference or comment on involving or including people and communities in how data is collected, stored, governed or reused.
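The search strings above follow one pattern: a core term combined with the same set of participation-related qualifiers. As an illustration only, the pattern can be sketched in Python as follows. The names `CORE_TERMS`, `QUALIFIERS`, `STANDALONE_QUERIES`, `MAX_RESULT_PAGES` and `build_queries` are hypothetical and were not part of the review, and quoting every core term uniformly is an assumption of this sketch.

```python
# Core terms and qualifiers copied from the keyword list in this section.
CORE_TERMS = [
    "Data stewardship",
    "Data trust",
    "Data cooperatives",
    "Data intermediaries",
    "Data governance",
    "Data commons",
]
QUALIFIERS = ["community", "people", "publics", "participatory", "inclusive"]

# Phrases searched on their own, without the qualifier group.
STANDALONE_QUERIES = ["Community based data sharing"]

# The review looked at the first 10 pages of results per database and keyword.
MAX_RESULT_PAGES = 10


def build_queries(core_terms, qualifiers, standalone):
    """Return one boolean search string per core term, then the standalone phrases."""
    alternatives = " OR ".join(f'"{q}"' for q in qualifiers)
    queries = [f'"{term}" AND ({alternatives})' for term in core_terms]
    return queries + list(standalone)


QUERIES = build_queries(CORE_TERMS, QUALIFIERS, STANDALONE_QUERIES)

if __name__ == "__main__":
    for query in QUERIES:
        print(query)
```

Generating the strings from two lists keeps the qualifier group identical across every core term, which makes the search easier to repeat or extend.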

To answer our research questions, we identified the following parameters:

organisational demographics such as geography, organisation type, organisation mission
area of work/applications of data described (including categories such as health data and environmental data)
participatory mechanisms in place or described
data governance structures in place or described
if academic literature, abstract/summary, keywords, type of literature
whether the source was in reference to an implemented project or a theoretical exploration in the field.

Participatory mechanisms were codified using the participatory data stewardship framework the Ada Lovelace Institute proposed in 2021 (Table 1).^[215]

The following options were pre-defined for the coding of data governance structures:

personal information management systems
data cooperatives
data trusts
data unions
data marketplaces
data-sharing pools
data exchanges
industrial data platforms
data custodians
trusted third parties.

Definitions for personal information management systems through to data-sharing pools were drawn from the European Commission’s Joint Research Centre’s map of data intermediaries,^[216] while the remaining definitions were drawn from the UK Government Centre for Data Ethics and Innovation (now Responsible Technology Adoption Unit) independent report on data intermediaries.^[217] Table 2 in the appendix provides summaries of each structure.
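To make the coding scheme concrete, the following is a minimal sketch of how one reviewed source might be recorded against the parameters and pre-defined options listed above. It is an illustration, not the tool used in the review: the class and field names (`Purpose`, `CodedSource`, `implemented` and so on) are hypothetical, and the five purposes are taken from the participatory data stewardship framework.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Purpose(Enum):
    """The five purposes of the participatory data stewardship framework."""
    INFORMING = "informing"
    CONSULTING = "consulting"
    INVOLVING = "involving"
    COLLABORATING = "collaborating"
    EMPOWERING = "empowering"


# Pre-defined options for coding data governance structures (from the list above).
GOVERNANCE_STRUCTURES = (
    "personal information management systems",
    "data cooperatives",
    "data trust",
    "data unions",
    "data marketplaces",
    "data-sharing pools",
    "data exchanges",
    "industrial data platforms",
    "data custodians",
    "trusted third parties",
)


@dataclass
class CodedSource:
    """One coded source; field names are illustrative assumptions."""
    title: str
    geography: str
    organisation_type: str
    area_of_work: str
    mechanism: Optional[Purpose]         # None if no mechanism is described
    governance_structure: Optional[str]  # None if no structure is described
    implemented: bool                    # True = implemented project, False = theoretical

    def __post_init__(self):
        # Reject governance labels that are not in the pre-defined list.
        if (self.governance_structure is not None
                and self.governance_structure not in GOVERNANCE_STRUCTURES):
            raise ValueError(f"Unknown governance structure: {self.governance_structure}")
```

A record could then be created with, for example, `CodedSource("Example initiative", "UK", "civil society", "health data", Purpose.CONSULTING, "data cooperatives", True)`, where every value is invented for illustration.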

Interviews

We conducted one-hour, online, semi-structured interviews with 10 people working on topics related to participatory and inclusive data stewardship. They were selected due to the prevalence of their work in the desk-based database review and recommendations from networks in this research field. Interviews were conducted from June to July 2024. Interviews explored initial reflections on findings from our desk research, interviewees’ perspectives and descriptions of the participatory and inclusive data stewardship landscape and thoughts on both participatory and inclusive governance mechanisms.

Interviews covered the following topics:

definitions and language used to describe data stewardship, data governance, participation and inclusion
reflections on our desk-research findings
reflections on participatory data stewardship in practice and inclusive data stewardship
overarching description of the landscape.

Interviews were recorded, transcribed and thematically coded for key themes and insights.

Methodological limitations

This review attempts to take a broad view across a complex and changing landscape. We recognise that the research approach has inherent limitations, and that what we present here can only be a snapshot in time from a limited perspective. In particular:

The landscape review was conducted using key terms in English, from a UK browser. This limits the opportunities for non-Western and non-English-speaking initiatives, literature and organisations to be identified.
Interviews were conducted with a small, non-representative sample of the professionals involved in this field. While we made efforts to reach out to professionals from varied backgrounds, we recognise that we have not represented the views of smaller, more grassroots organisations, or perspectives from larger private companies or regulators.
The combination of methods in this study was insufficient to represent practices in detail: for example, it could not demonstrate the extent to which mechanisms like governance oversight affected the development of power in decision-making. This meant that distinctions between collaboration and empowerment in practice were sometimes hard to make.

Together, these limitations mean that this review is not intended to be comprehensive, nor to offer a definitive perspective on how the participatory and inclusive data stewardship landscape has developed. We hope instead that this review provides a useful snapshot of themes, questions, challenges and learning from the landscape, as a basis for further research.

Acknowledgements

This review was lead authored by Octavia Field Reid and Roshni Modhvadia, with substantive input from Valentina Pavel and Luke Patterson.

We are grateful to the following interviewees and reviewers:

Interviewees:

Tim Davies, Director of Research and Practice, and Dr Jeni Tennison, Executive Director, Connected by Data

Professor Sylvie Delacroix, Inaugural Jeff Price Chair in Digital Law, King’s College London

Jennifer Ding, Senior Researcher, The Alan Turing Institute

Arne Hintz, Reader, Data Justice Lab, Cardiff University

Maui Hudson, Director, Te Kotahi Research Institute

Joe Massey, Senior Researcher, Open Data Institute

Vinay Narayan, Senior Manager, Aapti Institute

Gwen Phillips, Ktunaxa Nation

Stefaan Verhulst, Co-Founder, GovLab and the Data Tank

Reviewers:

Jessica Montgomery, Director, ai@cam, University of Cambridge

Kasia Odrozek, Director, Insights, Mozilla Foundation

Reema Patel, Elgon Social Research, and Policy and Public Engagement Lead, Digital Good Network

Aidan Peppin, Policy and Responsible AI Lead, Cohere for AI

Dr Emily Rempel, Public Participation Manager, Liverpool Civic Data Cooperative

This research was conducted in collaboration with the ESRC-funded Digital Good Network through the University of Sheffield (grant reference ES/X502352/1), and the Liverpool City Region Civic Data Cooperative funded by the Liverpool City Region Combined Authority and hosted by the University of Liverpool.

Appendix

1. Legal underpinnings for data stewardship: EU

We have focused on EU legislation because that is where most advances are being made in provisions that have the potential to underpin data stewardship mechanisms. Here we go into more detail about specific regulatory examples:

Two complementary pieces of legislation, the Digital Services Act (DSA) and the Digital Markets Act (DMA), include a set of prohibitions to address harmful practices and unfair market practices which bring more granularity to the GDPR architecture of protection described above. For example, the DSA prohibits using manipulative design techniques and targeted advertising based on exploiting sensitive data.

The DSA also mandates increased transparency and accountability for key platform services (such as providing the main parameters used by recommendation systems) and includes obligations for large companies to perform systemic risk assessments. These provisions set important stewardship boundaries around the use of data and provide a necessary layer of information about how systems work and their potential impacts and consequences.

Accompanying the DSA, the Digital Markets Act (DMA) introduces several prohibitions (such as combining or cross-using personal data without user consent), and aims to prevent exploitative behaviour and enable more choice for users. For example, the GDPR right to data portability is expanded, allowing end users to have real-time and continuous access to data provided through the use of the service, as well as data generated through their activity on core services such as marketplaces, app stores, search and social media. This mechanism is an important element of the stewardship toolbox that not only empowers users with more choice over their data, but also supports innovation by making it easier for new services to attract users who can conveniently switch their data from one company to another.

Some measures are designed to support data sharing and present alternatives to dominant ecosystem actors. The cross-sectoral 2023 Data Governance Act (DGA) aims to facilitate data sharing through a framework of use and reuse of data that includes mechanisms enabling the reuse of public-sector data, the development of sectoral common data spaces (such as for health, transport, energy, environment and finance), the design of ‘data intermediaries’ and voluntary mechanisms for individual data sharing through ‘data altruism’.^[218] The ‘data intermediary’ leg of the DGA is intended to act as an ‘alternative model to the data-handling practices of the Big Tech platforms’^[219] and envisions organisations and companies offering data intermediation services as ‘trusted intermediaries’.

Intermediation services fall under three main categories: 1) data marketplace and data exchange services that, for example, facilitate data sharing in more secure ways, management of data permissions and preferences, services to anonymise data; 2) intermediation services that facilitate sharing of personal and non-personal data and services that enable the exercise of the data subjects’ rights under the GDPR; 3) services of data cooperatives.^[220] Creating trust and preserving neutrality in this vision of data stewardship means companies are required to structurally separate the intermediation service from other commercial services they might be providing.

These measures are supporting a growing ecosystem: at the time of writing, one year after the application of the law, the EU register of data intermediation services lists 11 data intermediary services and one data altruism organisation.^[221] Of the registered data intermediaries, two providers mention they offer ‘services of data cooperatives’ and one mentions that it intends to begin offering these services in 2025. However, it is unclear from the registration notifications how the providers distinguish between data marketplace services, data exchange services and data cooperative services.^[222]

As described in the introduction, there are global initiatives that aim to create and strengthen public institutions that collect and use data for economic and societal benefit. Following on from the DGA, the sectoral legislation formally setting up the European Health Data Space was institutionally agreed in 2024 and is meant to facilitate the reuse of data for research, innovation, regulatory and public policy purposes across the EU.^[223] It also includes the option for ‘trusted data holders’ to securely process requests for access to health data.^[224] Furthermore, the EU regulation on high-value datasets aims to ‘fuel artificial intelligence and data-driven innovation’ by making more datasets publicly available, and includes enabling access to geospatial, meteorological, statistics, company and mobility data that can ‘generate important societal and economic benefits’.^[225]

The European Data Act (DA) includes new rules around access and sharing of data from connected devices among businesses and end users.^[226] For example, end users are able to access the data they generate through their use of a connected product (such as a car or fitness device) or a related service (additional controls or customisations used together with the connected device, such as temperature adjustments in smart home devices). It also includes provisions for public-sector bodies to request access to private data in exceptional circumstances such as during national emergencies. Additionally, it includes rules on interoperability for cloud providers and for common data spaces. These rules will become applicable in 2025.

2. Frameworks for participation

Conceptual frameworks that support participation

Arnstein’s ladder of participation (1969)

Description: A theoretical conception of how the public can participate in governance and democracy that has become foundational to contemporary thinking about public participation. Arnstein was a researcher and policymaker who was highly critical of ‘tokenistic’, ineffectual consultation methods that appear to empower people, but fail to shift power or create accountability mechanisms between citizens and decision-makers.

Significance: The ladder equates participation with power and has a political aim: to redefine participation as redistribution of power, to support greater empowerment of people in political and economic processes with a goal to enable them to share in the benefits of an affluent society. It identifies the different forms of empowerment achievable through participation, from non-participation to tokenism (consulting and informing) to empowerment through citizen control, delegated power and partnership.^[227]

Citizen power
– Citizen control
– Delegated power
– Partnership

Tokenism
– Placation
– Consultation
– Informing

Non-participation
– Therapy
– Manipulation

IAP2 participation spectrum

Description: A conceptual framework, designed to support identification and selection of the appropriate level of participation for the public in any public participation process.

Significance: The spectrum maps the mechanism of participation to the purposes of informing, consulting, involving, collaborating and empowering: for example, where the purpose is to provide members of the public with balanced and objective information about a particular project, proposal or intervention, then the mechanism is to ‘inform’; whereas if the purpose is to partner with members of the public across key moments of decision-making, the mechanism is more aligned with ‘collaborate’. The spectrum also outlines plain language ‘promises’ to the public, to guide behaviour. For example, to ‘empower’ the public means to place final decision-making in their hands, with a promise to the public from power-holders to implement what the public decides.^[228]

Mechanism for impact on decision-making	Inform	Consult	Involve	Collaborate	Empower
Public participation purpose	To provide the public with balanced and objective information to assist them in understanding the problem, alternatives, opportunities and/or solutions	To obtain public feedback on analysis, alternatives and/or decisions.	To work directly with the public throughout the process to ensure that public concerns and aspirations are consistently understood and considered.	To partner with the public in each aspect of the decision including the development of alternatives and the identification of the preferred solution.	To place final decision-making in the hands of the public.
Promise to the public	We will keep you informed.	We will keep you informed, listen to and acknowledge concerns and aspirations, and provide feedback on how public input influenced the decision.	We will work with you to ensure that your concerns and aspirations are directly reflected in the alternatives developed and provide feedback on how public input influenced the decision.	We will look to you for advice and innovation in formulating solutions and incorporate your advice and recommendations into the decisions to the maximum extent possible.	We will implement what you decide.

Ada Lovelace Institute Participatory Data Stewardship framework

Description: A conceptual framework for understanding how stewardship – particularly in relation to data but applicable to other domains – can be carried out with meaningful participation. Drawing inspiration from both Arnstein’s ladder of citizen participation and the IAP2 spectrum, it demonstrates ways that people can gain increasing levels of control and power over their data. It also provides examples of specific mechanisms and practices that can facilitate this.^[229]

Significance: The framework describes in detail factors that contribute to each of the IAP2 objectives. For example, people can be informed about what is happening to data about or affecting them through organisations adopting ‘meaningful transparency’ and explainability, or consulted through community engagements and public attitudes research. When people are empowered, they play a significant role in making decisions about how data is governed, through continuously shaping rules, being involved in data-access mechanisms, owning or controlling mechanisms like cooperatives or otherwise deciding terms of data access or licensing. Where the IAP2 framework aims to ensure that power-holders implement public wishes, this framework pushes empowerment towards power-holders providing advice and assistance on request, to support publics to design and develop their own data governance frameworks.

Purpose	Description (drawn from Arnstein’s ladder of participation)^[230]	What people can expect (drawn from the IAP2 Participation Spectrum)^[231]	Relevant participatory mechanisms, methods and activities
Informing	‘A one way flow of information’	‘We will keep you informed on how your data is being used’	• Meaningful and radical transparency • Explainability – the process of enabling a data-driven system to be explained in human terms • Mechanisms such as model cards and data sheets • Rethinking and reframing how we talk about data, so it is more accessible and inclusive
Consulting	‘Inviting people’s opinions, through attitude surveys, neighbourhood meetings and public hearings’	‘We will listen to, acknowledge concerns and aspirations, and provide feedback on how public input influenced the data-governance framework’	• Community networks • User experience (UX) testing and co-design • Surveys and public attitudes research • Community engagements and consultations
Involving	‘Allow citizens to advise, but retain for powerholders the continued right to decide’	‘We will work with you to ensure your concerns and aspirations are directly reflected in data governance… we will provide feedback on how public input influenced these decisions’	• One-off and institutionalised public deliberation and deliberative democracy initiatives • Lived experience panels • Horizon scanning, design thinking and futures thinking
Collaborating	‘Enables people to negotiate and engage in trade-offs with powerholders’	‘We will look to you for advice and innovation in design of data-governance frameworks… and incorporate your advice and recommendations to the maximum extent possible’	• Public deliberation and deliberative democracy initiatives • Bottom-up ‘data governance initiatives’ managed by an independent fiduciary (for example, data trusts, and data-sharing contracts that build in collaboration) • Participant panels and data donation mechanisms
Empowering	‘Citizens obtain the majority of decision-making seats or full managerial power’	‘We will provide advice and assistance as requested in line with your decisions for designing/developing your own data-governance framework’	• Data governance rules shaped and routinely reviewed by beneficiaries and data donors • Voting on governance boards of data-access initiatives • Ownership and/or control of data cooperatives • Setting terms of data licensing and access, with permissions overseen by citizens

Operational frameworks that support participation

Mozilla’s Practical Framework for Applying Ostrom’s Principles to Data Commons Governance (2021)^[232]

Description: Mozilla’s Practical Framework for Applying Ostrom’s Principles to Data Commons Governance (2021) supports the development of data stewardship based on commons principles, which centre sustainable and ethical production, redistribution and collaboration.

Significance: Through research that identifies existing, successful practices, it offers two frameworks based on Ostrom’s eight principles – the first framework supports planning and evaluation of governance of existing data commons initiatives, and the second proposes detailed design principles to support the robust development of data commons governance.^[233] Both frameworks understand that it is not simply the data governance that needs to be considered in a responsible, participatory and inclusive way, but all the practical (sociotechnical) infrastructure that sits around the data, including finances or community behaviours.

Design principle	Description
Clearly-defined boundaries	Individuals who have rights to appropriate resources must be clearly defined, as must the boundaries of the resource itself. Your data commons is bounded by a well-defined purpose, a set of values you prioritise, a well-scoped mission, and a clear sense of who you are doing this with and for. Together, these determine who can contribute, access, and use the data resource or make decisions about it. It also helps determine the shape and context of the data resource itself.
Appropriate rules	Rules are appropriately related to local conditions (including both regarding the appropriation of common resources — restricting time, place, technology, quantity, etc.; and rules related to provision of resources — requiring labor, materials, money, etc.) The various resources the commons stewards, such as data, people’s time, funding, as well as the organisation itself, have appropriate rules to describe how they can be used and under what conditions. In general, the rules should ensure that those who contribute resources benefit from their contribution and that harms from the use of resources are curtailed.
Rule-making processes	Collective-choice arrangements allow most resource appropriators to participate in the decision-making process. In short: those who are affected by decisions and rules that govern the resource or the community itself should have a way of influencing those decisions.
Monitoring	Effective monitoring by monitors who are part of, or accountable to, the appropriators. This means that compliance with the rules established is monitored and that users of the commons have an active role in monitoring compliance. With regard to data commons, this includes monitoring of data production processes — ongoing validation of data integrity, verification of data quality, — as well as monitoring data access and use.
Sanctions	There is a scale of graduated sanctions for resource appropriators who violate community rules. This principle refers to the set of accountability measures that should be in place to guarantee rules are enforced. However, the focus on graduated sanctions implies that not every violation of a rule is treated the same and, for instance, intent and harm are taken into account when applying sanctions.
Conflict-resolution mechanisms	Appropriators and their officials have rapid access to low-cost local arenas to resolve conflicts among appropriators or between appropriators and officials. When conflict arises in a data commons, there needs to be an effective, inexpensive, and otherwise accessible way to handle that conflict. In addition, a data commons needs to decide and make clear which conflicts will be handled internally and which ones should be resolved externally, for instance by going to court.
Right to self-governance	The rights of a community to devise and govern its own institutions is recognized by external authorities. In the context of data commons, this principle encourages us to understand how far the decisions we make about the collection and use of data are in line with, for instance, data protection regulations.
Nestedness / interoperability	Appropriation, provision, monitoring, enforcement, conflict resolution, and governance activities are organized in multiple layers of nested enterprises. In relation to data commons, this principle can refer to a possible need for one data commons to interoperate with another, or to break-up one large commons into smaller, nested commons that interoperate with one another. Doing so would allow each smaller commons to make decisions that better reflect their circumstance and match a narrowly defined purpose.

The content in this table, from Mozilla’s Practical Framework for Applying Ostrom’s Principles to Data Commons Governance, is licensed under a CC-BY-NC-SA licence.[234]

A data stewardship framework for NZ ^[235]

Description: A national government framework and toolkit that is designed to enable the New Zealand government to ‘better manage and use the data it holds on behalf of New Zealanders’.

Significance: This framework is significant because it represents a national government approach to stewarding data, covering stewardship of the data system to fulfil statutory obligations, for example to protect privacy, while also maintaining trust and maximising the value of data. It is the most wide-ranging of the frameworks described here, encompassing stewardship of data created, collected and used by organisations and individuals. The framework contains seven key elements: strategy and culture; rules and settings; roles, responsibilities and accountabilities; data capability and quality; people capability and literacy; influence and advocacy; monitoring and assurance.

The 7 key elements for effective data stewardship:

Strategy and culture	A strategy that provides a shared vision and clear direction, and a data culture that enables strategy implementation and sustains good data stewardship practice.
Rules and settings	Legislation, policies, principles, and sanctions providing boundaries and guiding how the data system should operate.
Roles, responsibilities, accountabilities	Governance structures, role definitions and expectations, and leadership.
Data capability and quality	Tools, processes, designs, metadata structures, and platforms for managing, storing, describing, and sharing data.
People capability and literacy	Skills, knowledge, and services for accessing, managing, analysing, and communicating data and insights.
Influence and advocacy	Effective relationships and networks to endorse, promote, and support good data practice.
Monitoring and assurance	Assessing environmental trends and developments, measuring stewardship performance, and adapting the stewardship toolkit to respond to changing circumstances or new information.

The content in this table, from A data stewardship framework for NZ, is licensed under a CC-BY-4.0 licence.[236]

Open Data Institute: What makes participatory data initiatives successful?

Description: Two separate evaluative frameworks that express different aspects of participation: degrees of involvement in participatory practices and diversity in the participatory data stewardship ecosystem across a number of considerations – and factors that contribute to their success.

Significance: The first framework sets out measures of whether people are participating in some or all of the creation, design, decision-making or policy around data; whether the initiative is top-down, bottom-up or a mixture of these two approaches; what terms are used to describe different kinds of participation; whether the initiatives are embedded or experimental; and who stands to benefit from the process and outcomes of participation, and in what way (for example, whether it is organisations creating legitimacy for themselves, communities or participants). The second addresses how those implementing participatory approaches conceptualise and define their success; which success factors affect an initiative adopting participatory approaches; and whether there is a unifying vision of success across different initiatives, stakeholder groups or geographies. It is designed to guide people setting up and evaluating participatory initiatives.

Degrees of involvement in participatory practices:^[237]

Representation	Who is able to participate, and do they represent the people who are at risk from exclusion, discrimination, or who might benefit most from having a say?
Motivations, incentives and remuneration	How are people compensated for their contributions when they participate in data practices? Why do people participate and how can incentives be used to make fairer and more representative data systems? What is the motivation for an organisation or government to implement these approaches?
Sustainability of participation	Will people continue to participate over time, or are initiatives creating ‘one-off’ experiences?
Experiences of participation	What do people learn, do they experience a sense of empowerment themselves?

Different aspects of participatory initiatives that contribute to success (based on a synthesis of literature and interviews, and noting variances for the Global South):^[238]

Participants	Initiatives should involve diverse representatives of communities as participants and champions.
Other stakeholders	Including the internal team, champions, steering committee or advisory board, these people drive initiatives forward and require interdisciplinary skillsets that support direction, mentoring and impact.
Inputs	Adequate funding, time and technological understanding can build flexibility and resilience into participatory initiatives.
Design	Design choices, including participatory methodologies, rigour of research methods, involvement and incentivisation of participants, and legal/governance models, have significant impacts on outcomes.
External factors	Aspects such as the political climate, system dynamics, legal considerations and relationships with decision-makers can positively or negatively impact success.

The aspects of participatory initiatives that the report identifies as contributing to success include involving diverse representatives of communities, the role of experts and participants, and practical factors like funding, time and technological understanding, all within the confines of a more-or-less hospitable political power system.^[239]

Aapti Fostering Participatory Data Stewardship

Description: A multi-layered, cross-sectoral value proposition of participatory mechanisms for data governance, expressed through paradigm analysis across three sectors: Environment and Sustainability, Healthcare and Urban Governance. It focuses on the role of the public sector in ecosystem enablement, technical pathways for operationalising community participation in data stewardship and scaling community governance through data stewardship.

Significance: The playbook is the most complete mapping of norm development currently available, building on case studies and experiences of real-world initiatives. The content is too complex to map here. Its approach matches operational challenges to strategies, to describe in detail barriers or obstacles and their counterpart solutions, for example: ‘Participation is structured as a ‘one-off’ engagement and does not persist throughout the lifecycle of data usage & governance’ produces the strategies ‘Identify incentives of data generators (individuals and communities) through consultation to better identify approaches to embed participation’ and ‘Remodel the prevailing regulatory landscape for data governance to embed mechanisms for community participation throughout the lifecycle of data usage.’^[240]

Frameworks that support inclusion

FAIR principles [241]

Description: The FAIR principles (Findable, Accessible, Interoperable and Reusable) are technical, operational principles developed in 2016 in response to increasing interest in large datasets for discovery and innovation in the interests of the scientific community and the public, and to growing reliance on computational processes in data management.

Significance: The principles reflect the values of fair information practices and emphasise ‘machine actionability’ (the ability of machines to access and reuse data without human intervention), to optimise ‘the management, preservation and sharing of research data in order to maximise its utility as a reusable resource’.^[242] They are widely used in data governance, and a vital technical underpinning to legal mechanisms for data stewardship, particularly where power differentials exist between data subjects and those who wish to use datasets.^[243]

Findable	● (Meta)data are assigned a globally unique and persistent identifier ● Data are described with rich metadata (defined by R1 below) ● Metadata clearly and explicitly include the identifier of the data they describe ● (Meta)data are registered or indexed in a searchable resource
Accessible	● (Meta)data are retrievable by their identifier using a standardised communications protocol ● The protocol is open, free, and universally implementable ● The protocol allows for an authentication and authorisation procedure, where necessary ● Metadata are accessible, even when the data are no longer available
Interoperable	● (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation ● (Meta)data use vocabularies that follow FAIR principles ● (Meta)data include qualified references to other (meta)data
Reusable	● (Meta)data are richly described with a plurality of accurate and relevant attributes ● (Meta)data are released with a clear and accessible data usage license ● (Meta)data are associated with detailed provenance ● (Meta)data meet domain-relevant community standards
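To illustrate what ‘machine actionability’ can mean in practice, the following is a minimal sketch, not part of the FAIR specification, of how a data steward might automatically check a metadata record for gaps against the four principles. The record structure, the field names and the mapping from fields to principles are all hypothetical assumptions made for illustration; a real assessment would cover every sub-principle in the table above and use a community metadata standard.

```python
from dataclasses import dataclass

# Hypothetical metadata record: these field names are illustrative
# assumptions, not terms defined by the FAIR principles themselves.
@dataclass
class MetadataRecord:
    identifier: str = ""       # a globally unique, persistent identifier
    description: str = ""      # descriptive metadata about the dataset
    access_protocol: str = ""  # for example "https"
    vocabulary: str = ""       # shared vocabulary used for the metadata
    licence: str = ""          # data usage licence
    provenance: str = ""       # where the data came from

# Assumed (simplified) mapping from each FAIR principle to the fields
# that this sketch treats as evidence for it.
FAIR_CHECKS = {
    "Findable": ("identifier", "description"),
    "Accessible": ("access_protocol",),
    "Interoperable": ("vocabulary",),
    "Reusable": ("licence", "provenance"),
}

def fair_gaps(record: MetadataRecord) -> dict[str, list[str]]:
    """Return, per principle, the fields that are empty in the record."""
    gaps: dict[str, list[str]] = {}
    for principle, fields in FAIR_CHECKS.items():
        missing = [name for name in fields if not getattr(record, name).strip()]
        if missing:
            gaps[principle] = missing
    return gaps

# Example record with placeholder values; provenance is left empty.
example = MetadataRecord(
    identifier="https://example.org/dataset/1",
    description="Placeholder description of a community dataset",
    access_protocol="https",
    vocabulary="placeholder-vocabulary",
    licence="CC-BY-4.0",
)
print(fair_gaps(example))
```

A check of this kind only tests whether fields are present. It says nothing about whether the people the data describes had a say in how it is governed, which is the gap the CARE principles below are designed to address.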

CARE Principles for Indigenous Data Governance [244]

Description: CARE principles support the rights of Indigenous peoples to control data about their peoples, land and resources, particularly when there are tensions between protecting Indigenous communities’ rights and supporting data sharing for research and public benefit. Used in situations where equitable participation and outcomes are required in data access and use, they are designed to be complementary to, and are frequently considered in relation to, FAIR principles.^[245] ^[246]

Significance: The principles were developed to ‘ensure that decisions made about data pertaining to Indigenous communities […] are responsive to their values and collective interests,’^[247] and go beyond consideration of the rights of individual data subjects to consideration of the effects on groups and communities, particularly in relation to risks that accumulate through practices of data matching and prediction. As well as explicitly addressing power imbalances that affect the rights of Indigenous peoples, the principles are a foundational model for considering affected people in data stewardship.

CARE Principles for Indigenous Data Governance

Collective benefit	● For inclusive development and innovation ● For improved governance and citizen engagement ● For equitable outcomes
Authority to control	● Recognising rights and interests ● Data for governance ● Governance of data
Responsibility	● For positive relationships ● For expanding capability and capacity ● For Indigenous languages and worldviews
Ethics	● For minimising harms and maximising benefit ● For justice ● For future use

The content in this table, from CARE Principles for Indigenous Data Governance, is licensed under a CC BY-NC-ND 4.0 licence.[248]

Joint Research Centre (European Commission) Mapping the landscape of data intermediaries^[249]

Description: While not described as a framework, the JRC (EC) mapping describes properties of inclusive data governance in relation to data intermediaries. In the analysis, two broad features of inclusive data governance (control and agency, and value and benefit sharing) are mapped in a normative way against properties of data governance approaches, describing when data governance practices are inclusive. In relation to control and agency, inclusive data governance will enable greater diversity in access to data and decision-making in relation to how data is shared, accessed and used, when compared to data-sharing by large corporate platforms. In relation to value and benefit sharing, inclusive data governance will enable different kinds of value generation, including economic, social and moral value, and the ability to negotiate rights including privacy.

Significance: The properties described support horizontal relationships between different parties involved in data collection and use. They also offer ways for less dominant ecosystem actors to access and share data, encourage data sharing in the public interest and support empowerment of data subjects and communities.

Features of inclusive data governance	Data governance approaches are inclusive when…
Control and agency	…a broader range, and more diverse types, of actors can access data and decide how data are shared, accessed and used, compared to unilateral agreements typically set by Big Tech platforms.
	…the interests, needs and rights of those who currently have less power in the data economy (such as citizens, civil society organisations, but also SMEs and local public authorities) are protected and promoted.
	…data holders, data subjects, and all individuals represented in data, or affected by how data are used, can rely on governance mechanisms and tools that allow them to have agency over their data.
	…there is participation around data. Participation takes place in different forms – from public dialogues and ethics committees to clear terms of services and privacy-enhancing technologies, and from individuals controlling their personal data to collective forms of bargaining over data rights and collaborative decision-making on data commons.
	…SMEs and start-ups have greater control over their data, and when they can access relevant data sources and use them to build functional (and socially relevant) data-driven services instead of being subjected to the information asymmetries of data monopolies.
Value and benefit sharing	…different kinds of value are generated from data across multiple value chains and sectors of society. The different types of value range from economic and social to moral, including the ability to negotiate privacy.
	…the value generated from data is spread across a wide range of actors and collective claims can be made about value creation and distribution.
	…the outcomes of data use are distributed fairly, for instance when some of the profits that emerge from commercial data are returned to the public domain or to other actors that enabled the production of these data in the first place (for example, from public infrastructures to citizens’ digital footprints). This resonates with notions such as ‘data solidarity’ and ‘data as utility’.
	…economic value generated from data use is created through partnerships established to optimise data value chains in a sector or logistics.

The content in this table, from Joint Research Centre (European Commission) Mapping the landscape of data intermediaries, is licensed under a CC-BY-4.0 licence.[250]

3. Detailed findings from the desk-based research

Geography

Most organisations were based in the West. Figure 5 maps the countries in which organisations were based. Nearly two-fifths (37%) were from North America, 20% from the UK, and 12% from other European countries. Nearly one-fifth (19%) were spread across the rest of the world, with representation from countries across South America, Asia, Africa, Australia and New Zealand. Six per cent of the organisations we identified were international in nature.

Type of organisations/institutes

For this analysis, we looked at each unique organisation in the dataset, removing duplicate entries where one organisation contributed to more than one entry. We coded entries against broad categories of organisational types (for example, academic institution or private organisation) but acknowledge that these boundaries are not concrete. We found that organisations identifying as think tanks were often also registered charities, that academic institutions housed research institutes, and that some organisations received funding through multiple institutional bodies, both public and private, making classification challenging.

Half of the organisations in the dataset (50%) were academic institutions such as universities. Of the academic institutions, 34% were based in the US and another 20% were based in the UK.

Fifteen per cent of the entries in the dataset were from non-profit organisations. These varied in size and focus, with some being issue-specific and regionally contained, such as the Connecticut Data Collaborative,^[251] which looks at local urban data, and others that are larger in size and engage more broadly on issues around data and data stewardship, such as the World Economic Forum and its publication on digital agency and data intermediaries.^[252]

Private organisations were the next largest proportion in the dataset, making up 10% of the organisations we identified. They included data storage solutions such as Cosy Cloud,^[253] personal information management systems such as BitsAboutMe^[254] and consultancy service providers such as Hestia.AI.^[255]

Government-funded organisations made up another 8% of the organisations in the dataset. They included participatory data governance initiatives from local government such as the Camden Council Data Charter^[256] and Manchester City Council’s smart city initiative, CityVerve,^[257] and nationally funded projects such as Genomics England.^[258]

Research institutes (for example, the Institute for Scientific Interchange),^[259] think tanks (for example, Urban Institute),^[260] policy organisations (for example, the Information Accountability Foundation)^[261] and civil society organisations (for example, Jan Sahas Foundation)^[262] made up the smallest proportion of entries.

Applications of data

We identified the specific uses of data that were referred to in our dataset to better understand the context in which participatory and inclusive data stewardship practices are occurring. Figure 7 shows the prevalence of different applications.

We found that 38% of entries referred to general applications of data – or rather, to no one specific application of data. In these instances, entries would often comment on various types of data institutions, such as data trusts or data intermediaries, or explore principles around data flows and ownership responsibilities more generally.

Applications in health were the second most common use of data identified. On-the-ground projects (that is, things happening in the real world) in this space covered health data voluntarily donated by members of the public, electronic health record platforms, and secure research environments or repositories with health-related datasets for research and development purposes. Theoretical literature in this space tended to examine conditions for building trust in data-sharing practices, as well as the role of involving community members in decision-making processes around health data.

Remaining prevalent categories of data applications were community data (for example, neighbourhood data, urban planning), personal data (for example, voluntarily shared demographic and behavioural data) and environmental data (for example, agricultural data). Other less common applications of data included migration data, social care data, employment data and space research data.

Types of sources represented in our dataset

We also categorised entries based on whether they were theoretical pieces of work or on-the-ground, implemented projects. This distinction was made to understand the balance in the landscape between theory and practice.

We found that implemented projects – that is, activities that had been trialled or rolled out with real people – made up a slightly higher proportion of our dataset entries than theoretical work. Implemented projects ranged from national initiatives like New Zealand’s national data stewardship framework^[263] and the US National Institutes of Health Accelerating Medicines Partnership^[264] to more localised projects such as Data Driven Detroit^[265] and Vision Philadelphia.^[266]

Theoretical projects fell into the following broad sub-categories:^[267]

Commentaries, which leaned on existing literature without performing a comprehensive review, introduced new ideas without formally grounding them in an empirical evidence base, or were speculative or position-setting in nature. Examples included commentaries on the application of data governance structures in specific contexts such as data trusts in African countries,^[268] data cooperatives in health research,^[269] and data intermediaries in open data practices.^[270]
Case studies, which drew on a small number of empirical examples and often made new recommendations or observations from these examples. They ranged from explorations of specific initiatives in depth, such as the Personal Data Trust Bank in Japan,^[271] to reviews of several similar initiatives to form recommendations and conclusions, such as a review of citizen-generated open data more broadly.^[272]
Literature reviews/landscape analyses, which synthesised existing literature in depth. They included systematic reviews^[273] and scoping reviews,^[274] as well as critical analyses of existing literature or implemented initiatives. They most often focused on a varied range of data uses and governance structures, and authorship tended to span academia, think tanks, policy organisations and other research institutes.
Frameworks/toolkits/sets of principles, which were proposed for others to either follow or use. These sources varied from being context- or regionally bound to broader frameworks for more global use. National and local government guidance included New Zealand’s data stewardship framework and Camden Council’s Data Charter, which we discussed earlier in this report. More general guidance included a United Nations framework for digital public infrastructure,^[275] and frameworks for data commons governance and data ownership.^[276] ^[277]
Original primary research, which often included public engagement with members of the general public or workshops with specific stakeholders. It included stakeholder workshops (for example, a multistakeholder working group assembled to develop commitments for the ethical sourcing, use and reuse of patient data)^[278] and engagement with specific service users or people likely to be affected by specific data systems (for example, community-based research and interventions around data).^[279]

Figure 8 shows the broad proportions of each type of article within the theoretical literature. The following sections describe the literature within some of these subcategories.

Figure 8: Types of theoretical literature

This research builds on a programme of work Ada began in 2020, making the case for the societal value of data (beyond direct commercial or economic value), recognising and addressing asymmetries of power and data injustice, and promoting and enabling conditions for data stewardship.^[280]

In 2021, Exploring legal mechanisms for data stewardship explored the specific properties of existing legal mechanisms for stewarding predominantly personal data – data trusts, data cooperatives and corporate and contractual mechanisms. As well as supporting a trustworthy environment for organisations to share data, it identifies the importance of purpose and benefits to determine transparent and equitable relationships between individuals and organisations in data-sharing initiatives. It proposes empowering individuals to redefine the terms of data use, as well as describing the rights and responsibilities for stewarding data of (respectively) trustees, members and organisations.

In the same year, Participatory data stewardship set out a framework of participation to understand how and where people have different levels of power, control and agency over their data. This proposed a spectrum of participation purposes – inform, consult, involve, collaborate and empower – and a range of methods and activities that might map to those purposes – from community networks to one-off public deliberations, to bottom-up data governance initiatives managed by an independent fiduciary body, to routine shaping and review of data governance, data licensing, sharing and access permissions by beneficiaries. The report advocates for meaningful transparency and explainability, and reframing language and narratives around data governance.

In 2022, Rethinking data and rebalancing digital power recognised the entrenched power and ‘hyper-intermediated’ nature of the current digital economy and proposes new forms of data governance institutions, in which people and collectives control how data is generated, collected, used and governed. Subject to definition and testing, these could include ‘data unions’, which work in the interests of their members, towards individual or wider societal benefits; or a model in which members contribute device-generated data to a central database, owned collectively by the ‘data commons’, which invites ethically minded entrepreneurs to build business models on these databases and feed revenues back into the community.

It also proposes new models of accountability, including supervision and monitoring to support a new generation of responsible data intermediaries, and new forms of legal protection, for example efficient forms of legal redress in the event that a data intermediary acts against the interests of its beneficiaries. By increasing the power of data collectives and responsible intermediaries, the report proposes, it will be possible to curb the power of platforms that currently dominate through holding data and influence, and support civil society organisations to hold to account any ungoverned or unregulated, private or public exercises of power.

Finally, it sets out a vision for ensuring public participation in technology policymaking, in which ‘everybody who wants to participate in decisions about data and its governance can do so’. This is a radical and wide-reaching proposition: to achieve new forms of data governance, efforts need to go beyond political and legislative support, to disentangling the economic and infrastructural foundations (largely taking the shape of dependencies on large technology companies) and ‘restructuring the institutions and distributions of economic power’.

Footnotes

[1] Ada Lovelace Institute, ‘Rethinking Data and Rebalancing Digital Power’ (2022) <https://www.adalovelaceinstitute.org/project/rethinking-data/> accessed 3 November 2024.

[2] Ada Lovelace Institute, ‘Participatory Data Stewardship: A Framework for Involving People in the Use of Data’ (2021) <https://www.adalovelaceinstitute.org/report/participatory-data-stewardship/> accessed 3 November 2024.

[3] Sherry R Arnstein, ‘A Ladder of Citizen Participation’ (1969) 35 Journal of the American Institute of Planners 216.

^[4] ‘The Sudlow Review’ (HDR UK) <https://www.hdruk.ac.uk/helping-with-health-data/the-sudlow-review/> accessed 11 December 2024.

^[5] Ada Lovelace Institute and UK AI Council, ‘Exploring Legal Mechanisms for Data Stewardship’ (2021).

^[6] Ada Lovelace Institute (n 2).

^[7] Ada Lovelace Institute (n 1).

^[8] Ada Lovelace Institute and UK AI Council (n 5).

[9] The research data is provided here for use by other researchers: <https://docs.google.com/spreadsheets/d/1HCTJt9fmCm08GFauA_wVKO86GwsEPaXczGMzS3UDuto/>.

^[10] This work, carried out by the Digital Good Network, will be published in 2025 to complete the Participatory and Inclusive Data Stewardship programme.

^[11] ‘The Native BioData Consortium’ (Native BioData Consortium) <https://nativebio.org/> accessed 1 December 2024.

^[12] ‘ABALOBI’ <https://abalobi.org/about/> accessed 1 December 2024.

^[13] ‘The Data Assembly | The GovLab’ <https://thedataassembly.org/> accessed 1 December 2024.

^[14] ‘Saidot’ <https://www.saidot.ai/> accessed 1 December 2024.

^[15] Ada Lovelace Institute (n 2).

^[16] ‘Our Future Health’ (Our Future Health) <https://ourfuturehealth.org.uk/> accessed 1 December 2024.

^[17] ‘Safetipin | Safetipin, Creating Safe Public Spaces for Women’ <https://safetipin.com/> accessed 1 December 2024.

^[18] ‘Aya | Cohere For AI’ <https://cohere.com/research/aya> accessed 1 December 2024.

^[19] ‘Data Trusts’ (Data Trusts Initiative) <https://datatrusts.uk> accessed 1 December 2024.

^[20] ‘Driver’s Seat Cooperative’ (Driver’s Seat Cooperative) <https://www.driversseat.co> accessed 20 November 2024.

^[21] Information from email from Driver’s Seat Co-founder Hays Witt, reposted on Reddit, 2023: <https://www.reddit.com/r/couriersofreddit/comments/187xd1o/drivers_seat_cooperative_is_closing/> accessed 20 November 2024.

^[22] ‘WAO – The Workers’ Algorithm Observatory’ <https://wao.cs.princeton.edu/> accessed 1 December 2024.

^[23] Joint Research Centre (European Commission) and others, Mapping the Landscape of Data Intermediaries: Emerging Models for More Inclusive Data Governance (Publications Office of the European Union 2023) <https://data.europa.eu/doi/10.2760/261724> accessed 6 November 2023.

^[24] For a UK example, see the high numbers of patients who refused to participate in GPDPR: NHS England Digital, ‘We Must Listen to the Public on GP Data’ <https://digital.nhs.uk/blog/data-points-blog/2022/we-must-listen-to-the-public-on-gp-data> accessed 5 December 2024.

^[25] Helen Kennedy and others, ‘Living with Data Survey Report’ (Living with Data 2021).

^[26] Joint Research Centre (European Commission) and others (n 23) 18.

^[27] Jathan Sadowski in Ada Lovelace Institute (n 1) 60.

^[28] ibid. 62.

^[29] Rediet Abebe and others, ‘Narratives and Counternarratives on Data Sharing in Africa’, Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (Association for Computing Machinery 2021) <https://doi.org/10.1145/3442188.3445897> accessed 9 May 2024.

^[30] Jathan Sadowsky in Ada Lovelace Institute (n 1) 60 & 65 & Ada Lovelace Institute ‘The Political Economy of Data Intermediaries’. <https://www.adalovelaceinstitute.org/blog/political-economy-data-intermediaries/> accessed 1 December 2024.

^[31] Joint Research Centre (European Commission) and others (n 23).

^[32] Lara Groves and others, ‘Going Public: The Role of Public Participation Approaches in Commercial AI Labs’, 2023 ACM Conference on Fairness, Accountability, and Transparency (ACM 2023) <https://dl.acm.org/doi/10.1145/3593013.3594071> accessed 16 August 2024.

^[33] Felippe A Cronemberger and J Ramon Gil-Garcia, ‘Characterizing Stewardship and Stakeholder Inclusion in Data Analytics Efforts: The Collaborative Approach of Kansas City, Missouri’ (2022) 16 Transforming Government: People, Process and Policy 412.

^[34] ‘Fostering Participatory Data Stewardship | Aapti Institute’ <https://aapti.in/fostering-participatory-data-stewardship/> accessed 6 November 2024.

^[35] Patricia Wilson and others, ‘Research with Patient and Public Involvement: A Realist Evaluation – the RAPPORT Study’ (2015) 3 Health Services and Delivery Research 1.

^[36] Richard Milne, Annie Sorbie and Mary Dixon-Woods, ‘What Can Data Trusts for Health Research Learn from Participatory Governance in Biobanks?’ (2022) 48 Journal of Medical Ethics 323.

^[37] See Ada Lovelace Institute and Wellcome Trust Understanding Patient Data, ‘Foundations of fairness’ <https://www.adalovelaceinstitute.org/blog/the-foundations-of-fairness-for-nhs-health-data-sharing/> accessed 30 November 2024.

^[38] Sam HA Muller and others, ‘Dynamic Consent, Communication and Return of Results in Large-Scale Health Data Reuse: Survey of Public Preferences’ (2023) 9 Digital Health 20552076231190997.

^[39] ‘Policy Intervention 5: Empowering People to Have More of a Say in the Sharing and Use of Data for AI’ (The ODI, 5 July 2024) <https://theodi.org/news-and-events/blog/policy-intervention-5-empowering-people-to-have-more-of-a-say-in-the-sharing-and-use-of-data-for-ai/> accessed 1 December 2024.

^[40] Groves and others (n 32).

^[41] Harini Suresh and John V Guttag, ‘A Framework for Understanding Sources of Harm throughout the Machine Learning Life Cycle’, Equity and Access in Algorithms, Mechanisms, and Optimization (2021) <http://arxiv.org/abs/1901.10002> accessed 14 November 2023.

^[42] Meg Young and others, ‘Participation versus Scale: Tensions in the Practical Demands on Participatory AI’ [2024] First Monday <https://firstmonday.org/ojs/index.php/fm/article/view/13642> accessed 21 November 2024 2.

^[43] Tali Ramsey, ‘Facebook and Instagram Plans to Use UK Posts to Train AI Models – Which? News’ (Which?, 26 September 2024) <https://www.which.co.uk/news/article/facebook-and-instagram-plans-to-use-uk-posts-to-train-ai-models-av4gw4R8FjpE> accessed 1 December 2024.

^[44] Groves and others (n 32).

^[45] Deepa Seetharaman, ‘For Data-Guzzling AI Companies, the Internet Is Too Small’ (WSJ) <https://www.wsj.com/tech/ai/ai-training-data-synthetic-openai-anthropic-9230f8d8> accessed 1 December 2024.

^[46] Fox Meyer, ‘Law Can’t Protect World-Leading Te Reo Māori AI Database’ (Newsroom, 11 September 2024) <http://newsroom.co.nz/2024/09/12/new-zealand-law-not-prepared-to-protect-world-leading-te-reo-maori-ai-database/> accessed 17 November 2024.

^[47] See G20 policy brief: <https://www.t20brasil.org/media/documentos/arquivos/TF05_ST_05_Democratizing_AI_fo66d5d70141505.pdf> and AI Action Summit Public Interest AI theme: <https://www.elysee.fr/en/sommet-pour-l-action-sur-l-ia/public-interest-ai>.

^[48] ‘What Are Data Institutions and Why Are They Important?’ (The ODI, 29 January 2021) <https://theodi.org/insights/explainers/what-are-data-institutions-and-why-are-they-important/> accessed 11 June 2024.

^[49] ibid.

^[50] ‘What Does It Mean? | Shifting Power Through Data Governance’ (Mozilla Foundation) <https://foundation.mozilla.org/en/data-futures-lab/data-for-empowerment/shifting-power-through-data-governance/> accessed 13 September 2024.

^[51] Ada Lovelace Institute and UK AI Council (n 5).

^[52] ‘What Does It Mean? | Shifting Power Through Data Governance’ (n 50).

^[53] Sara Rosenbaum, ‘Data Governance and Stewardship: Designing Data Stewardship Entities and Advancing Data Access’ (2010) 45 Health Services Research 1442.

^[54] ‘Valuing Data’ (Bennett Institute for Public Policy) <https://www.bennettinstitute.cam.ac.uk/research/research-projects/valuing-data/> accessed 1 December 2024.

^[55] Elinor Ostrom used the term stewardship to describe the governance of common resources and developed design principles for collective governance. Though often focused on shared natural resources like pastures, forests or fisheries, applying Ostrom’s principles to data is useful for thinking about trade-offs and public benefit. See Ostrom, E. (2015). Governing the Commons. Cambridge: Cambridge University Press.

^[56] Abebe and others (n 29).

^[57] Nick Couldry and Ulises Ali Mejias, ‘The Decolonial Turn in Data and Technology Research: What Is at Stake and Where Is It Heading?’ (2021) 0 Information, Communication & Society 1.

^[58] Verhulst, S., ‘Reimagining data responsibility: 10 new approaches toward a culture of trust in re-using data to address critical public needs’, Data & Policy, Vol. 3, 2021. doi:10.1017/dap.2021.4

^[59] Hicks, Jacqueline. ‘The Future of Data Ownership: An Uncommon Research Agenda’. The Sociological Review 71, no. 3 (1 May 2023): 544–60. https://doi.org/10.1177/00380261221088120.

^[60] Sylvie Delacroix and Neil D Lawrence, ‘Bottom-up Data Trusts: Disturbing the “One Size Fits All” Approach to Data Governance’ (2019) 9 International Data Privacy Law 236.

^[61] Barbara Prainsack, ‘Logged out: Ownership, Exclusion and Public Value in the Digital Data and Information Commons’ (2019) 6 Big Data & Society.

^[62] Joint Research Centre (European Commission) and others (n 23).

^[63] Centre for Data Ethics and Innovation, ‘Unlocking the Value of Data: Exploring the Role of Data Intermediaries’ (2021) <https://www.gov.uk/government/publications/unlocking-the-value-of-data-exploring-the-role-of-data-intermediaries>.

^[64] Stefaan G Verhulst, ‘Data Stewardship Re-Imagined — Capacities and Competencies’ (Data Stewards Network, 14 October 2021) <https://medium.com/data-stewards-network/data-stewardship-re-imagined-capacities-and-competencies-d37a0ebaf0ee> accessed 12 December 2024.

^[65] ‘What Are Data Institutions and Why Are They Important?’ (n 48).

^[66] ‘Data Management and Use: Governance in the 21st Century – a British Academy and Royal Society Project | Royal Society’ <https://royalsociety.org/news-resources/projects/data-governance/> accessed 1 December 2024.

^[67] ‘What Does It Mean? | Shifting Power Through Data Governance’ (n 50).

^[68] ‘Data Trusts in 2020’ (The ODI, 17 March 2020) <https://theodi.org/news-and-events/blog/data-trusts-in-2020/> accessed 1 December 2024.

^[69] Open Data Institute, ‘Responsible Data Stewardship’ (2024) <https://theodi.org/insights/impact-stories/responsible-data-stewardship/>.

^[70] Aapti Institute, ‘Situating Civil Society Organisations on the Stewardship Spectrum’ (Aapti Institute, 5 July 2022) <https://medium.com/aapti/situating-civil-society-organisations-on-the-stewardship-spectrum-48e3e4a9b4e9> accessed 1 December 2024.

^[71] Rosenbaum (n 53).

^[72] For an outline of the role requirements for a data steward that demonstrates the complexity of information that is required to begin a stewarding initiative or role, see Verhulst, Stefaan G. ‘Wanted: Data Stewards – (Re-)Defining the Roles and Responsibilities of Data Stewards for an Age of Data Collaboration’. GovLab, 21 March 2020. https://doi.org/10.15868/socialsector.40383.

^[73] Ada Lovelace Institute and UK AI Council (n 5).

^[74] Joint Research Centre (European Commission) and others, Mapping the Landscape of Data Intermediaries: Emerging Models for More Inclusive Data Governance (Publications Office of the European Union 2023) <https://data.europa.eu/doi/10.2760/261724> accessed 6 November 2023.

^[75] Centre for Data Ethics and Innovation, ‘Unlocking the Value of Data: Exploring the Role of Data Intermediaries’ (2021) <https://www.gov.uk/government/publications/unlocking-the-value-of-data-exploring-the-role-of-data-intermediaries>.

^[76] Marina Micheli and others, ‘Emerging Models of Data Governance in the Age of Datafication’ (2020) 7 Big Data & Society <https://journals.sagepub.com/doi/epub/10.1177/2053951720948087> accessed 11 June 2024.

^[77] Stefaan Verhulst, Andrew Young and Prianka Srinivasan, ‘An Introduction to Data Collaboratives’.

^[78] ‘What Does It Mean? | Shifting Power Through Data Governance’ (n 50).

^[79] https://www.californialawreview.org/print/data-unions-the-need-for-informational-democracy

^[80] Kathryn S Quick and Martha S Feldman, ‘Distinguishing Participation and Inclusion’ (2011) 31 Journal of Planning Education and Research 282.

^[81] Gijs van Maanen, Charlotte Ducuing and Tommaso Fia, ‘Data Commons’ (2024) 13 Internet Policy Review <https://policyreview.info/glossary/data-commons> accessed 1 December 2024.

^[82] For further reading see Lee Anne Fennell, ‘Ostrom’s Law: Property Rights in the Commons’ (2011) 5 International Journal of the Commons <https://thecommonsjournal.org/articles/10.18352/ijc.252> accessed 13 November 2024.

^[83] Micheli and others (n 76).

^[84] Shivam Soni, ‘Data Stewardship – A Taxonomy’ (The Data Economy Lab, 24 June 2020) <https://thedataeconomylab.com/2020/06/24/data-stewardship-a-taxonomy/> accessed 1 December 2024.

^[85] ‘Participatory Data’ (The ODI, 5 July 2023) <https://theodi.org/insights/projects/participatory-data/> accessed 1 December 2024.

^[86] Ingrid Schneider, ‘Data Stewardship by Data Trusts: A Promising Model for the Governance of the Data Economy?’ in Claudia Padovani and others (eds), Global Communication Governance at the Crossroads (Springer International Publishing 2024) <https://doi.org/10.1007/978-3-031-29616-1_19> accessed 3 November 2024.

^[87] Heleen Janssen and Jatinder Singh, ‘Data Intermediary’ (2022) 11 Internet Policy Review <https://policyreview.info/glossary/data-intermediary> accessed 1 December 2024.

^[88] Centre for Data Ethics and Innovation (n 63).

^[89] This framing uses the same terms but is distinct from previous framings of top-down in relation to laws and regulations, and bottom-up in relation to power differentials in data stewardship mechanisms (particularly data trusts – see Sylvie Delacroix and Neil D Lawrence, ‘Bottom-up Data Trusts: Disturbing the “One Size Fits All” Approach to Data Governance’ (2019) 9 International Data Privacy Law 236), and more aligned to research that describes the development of active and reflexive agency of publics (see Kennedy, Helen, and Giles Moss. ‘Known or Knowing Publics? Social Media Data Mining and the Question of Public Agency’. Big Data & Society 2, no. 2).

^[90] Micheli and others (n 76).

^[91] Andrew J Hawkins, ‘Alphabet’s Sidewalk Labs Shuts down Toronto Smart City Project’ (The Verge, 7 May 2020) <https://www.theverge.com/2020/5/7/21250594/alphabet-sidewalk-labs-toronto-quayside-shutting-down> accessed 13 November 2024.

^[92] Claire Burgoyne, ‘Investment in Smart Data Services’ (Smart Data Research UK, 17 October 2024) <https://www.sdruk.ukri.org/2024/10/17/22-million-for-new-smart-data-services/> accessed 1 December 2024.

^[93] ‘The Public’ (Smart Data Research UK) <https://www.sdruk.ukri.org/our-work/the-public/> accessed 1 December 2024.

^[94] ‘Labour Party Manifesto’ (The Labour Party) (2024). Available at <https://labour.org.uk/wp-content/uploads/2024/06/Labour-Party-manifesto-2024.pdf#page=35> accessed 1 December 2024.

^[95] ‘Fostering Participatory Data Stewardship | Aapti Institute’ (n 34).

^[96] See Part 1 of the proposed Data (Use and Access) Bill as introduced on 23 October 2024. Available at <https://bills.parliament.uk/publications/56527/documents/5211>.

^[97] ‘Common European Data Spaces | Shaping Europe’s Digital Future’ <https://digital-strategy.ec.europa.eu/en/policies/data-spaces> accessed 13 November 2024.

^[98] ‘Data Governance Act Explained | Shaping Europe’s Digital Future’ <https://digital-strategy.ec.europa.eu/en/policies/data-governance-act-explained> accessed 12 November 2024.

^[99] ‘Data Stewardship and the Role of National Statistical Offices in the New Data Ecosystem | UNECE’ <https://unece.org/info/publications/pub/387294> accessed 13 November 2024.

^[100] ‘International Approaches to Data Trusts: Recent Policy Developments from India, Canada and the EU’ (Data Trusts Initiative) <https://datatrusts.uk/blogs/international-policy-developments> accessed 1 December 2024.

^[101] ‘A Data Stewardship Framework for NZ – Data.Govt.Nz’ <https://www.data.govt.nz/toolkit/data-stewardship/a-data-stewardship-framework-for-nz/> accessed 21 May 2024.

^[102] Manuela Lenk and Matthias Steffen, ‘Data Stewardship: State of the Work in Switzerland’ (UNECE) <https://unece.org/sites/default/files/2022-11/S2_1_Switzerland_DataStewardship_MLenk_MSteffen.pdf> accessed 21 May 2024.

^[103] See: Department for Digital, Culture, Media & Sport (DCMS). (2021). Data: A new direction, Section 7. Available at: <https://www.gov.uk/government/consultations/data-a-new-direction> accessed 21 May 2024.

^[104] Global Partnership on AI, OECD, Aapti Institute and Open Data Institute (ODI), ‘Enabling Data Sharing for Public Benefit through Data Trusts’ <https://gpai.ai/projects/data-governance/data-trusts/enabling-data-sharing-for-social-benefit-through-data-trusts.pdf> accessed 12 May 2024, 110.

^[105] Quick and Feldman (n 80) 281.

^[106] Rie Kamikubo, Kyungjun Lee and Hernisa Kacorri, ‘Contributing to Accessibility Datasets: Reflections on Sharing Study Data by Blind People’, Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (ACM 2023) <https://dl.acm.org/doi/10.1145/3544548.3581337> accessed 20 March 2024.

^[107] Erna Ruijer and others, ‘Open Data Work for Empowered Deliberative Democracy: Findings from a Living Lab Study’ (2024) 41 Government Information Quarterly 101902.

^[108] ‘Inclusive Data Taskforce Recommendations Report: Leaving No One behind – How Can We Be More Inclusive in Our Data?’ (UK Statistics Authority) <https://uksa.statisticsauthority.gov.uk/publication/inclusive-data-taskforce-recommendations-report-leaving-no-one-behind-how-can-we-be-more-inclusive-in-our-data/> accessed 1 December 2024.

^[109] Micheli and others (n 76).

^[110] See Mozilla. Data stewardship landscape scan v2. Available at <https://airtable.com/appuIgRGMn2cuLNo0/shrn9jnFOQByon2i7/tbl6wGsRuAebCHqlz/viwG9gCGG3V91xjDw?blocks=hide>.

^[111] Stefaan G Verhulst, ‘Wanted: Data Stewards – (Re-)Defining the Roles and Responsibilities of Data Stewards for an Age of Collaboration’ (GovLab 2020) <https://www.issuelab.org/permalink/download/40383> accessed 6 March 2024.

^[112] Delacroix and Lawrence (n 60).

^[113] Quick and Feldman (n 80) 282.

^[114] See the overview of GDPR principles in the Appendix of Exploring Legal Mechanisms for Data Stewardship: Ada Lovelace Institute and UK AI Council (n 5).

^[115] DCMS, ‘Data: A New Direction – Government Response to Consultation’ Section 7 (23 June 2022) <https://www.gov.uk/government/consultations/data-a-new-direction/outcome/data-a-new-direction-government-response-to-consultation> accessed 30 June 2024.

^[116] ibid. Section 1.7.

^[117] ‘Digital Markets, Competition and Consumers Act 2024’ <https://www.legislation.gov.uk/ukpga/2024/13/contents> accessed 13 November 2024.

^[118] ‘Online Safety Act 2023’ <https://www.legislation.gov.uk/ukpga/2023/50/contents> accessed 6 December 2024.

^[119] For a detailed mapping of different participatory mechanisms to objectives of informing, consulting, involving, collaborating and empowering, see Ada Lovelace Institute (n 2) 16-17.

^[120] This description is drawn from Arnstein’s ladder of participation, see Arnstein (n 3).

^[121] This articulation of what people can expect is drawn from the IAP2 and Ada Lovelace Institute’s spectrums of public participation.

^[122] ‘Algorithmic Transparency Recording Standard Hub’ (GOV.UK, 7 March 2024) <https://www.gov.uk/government/collections/algorithmic-transparency-recording-standard-hub> accessed 12 November 2024.

^[123] ‘Public Sector Consultations’ (Local Government Lawyer, 28 November 2023) <https://localgovernmentlawyer.co.uk/governance/314-governance-a-risk-articles/55748-public-sector-consultations> accessed 12 November 2024.

^[124] ‘Section 4: Consulting Residents | Local Government Association’ <https://www.local.gov.uk/our-support/communications-and-community-engagement/resident-communications/understanding-views-2> accessed 12 November 2024.

^[125] Communication from the Commission – Commission Guidelines for providers of Very Large Online Platforms and Very Large Online Search Engines on the mitigation of systemic risks for electoral processes pursuant to Article 35(3) of Regulation (EU) 2022/2065 2024.

^[126] NHS England, ‘NHS England » Co-Production’ <https://www.england.nhs.uk/always-events/co-production/> accessed 12 October 2024.

^[127] See p. 34 of the 2017 Next Steps on the NHS Five Year Forward View <https://www.england.nhs.uk/wp-content/uploads/2017/03/NEXT-STEPS-ON-THE-NHS-FIVE-YEAR-FORWARD-VIEW.pdf>.

^[128] Data Economy Lab, ‘Stewardship Navigator database’ (Airtable) <https://airtable.com/appF62QIgSpXGFGbf/shrH2IvivQ0ughB94/tblHfyAY7elk1pIux?> accessed 15 May 2024.

^[129] Mozilla, ‘Data Stewardship Literature Catalog’ (Airtable) <https://airtable.com/appC1r9c6VxJ7I8oI/shrrFNH3DObwYrlbU/tblyKb4qZ5VYaE0lu/viwyJ98yPY30fv2gs?backgroundColor=blue&blocks=hide> accessed 15 May 2024.

^[130] ‘The Data Institutions Register’ (Airtable) <https://airtable.com/appoMGboO9hE6PJ9w/shrcAnkPGmlzW3YgD/tblptl5NonXJHsPOc/viwdfMis8J0Z4v6uU> accessed 15 May 2024.

^[131] Patricia Wilson and others, ‘Research with Patient and Public Involvement: A Realist Evaluation – the RAPPORT Study’ (2015) 3 Health Services and Delivery Research 1.

^[132] Note that this review forms part of a programme of work with Digital Good Network, who will produce the review of implemented projects.

^[133] ‘A Data Stewardship Framework for NZ – Data.Govt.Nz’ <https://www.data.govt.nz/toolkit/data-stewardship/a-data-stewardship-framework-for-nz/> accessed 21 May 2024.

^[134] ‘Accelerating Medicines Partnership (AMP)’ (National Institutes of Health (NIH)) <https://www.nih.gov/research-training/accelerating-medicines-partnership-amp> accessed 21 May 2024.

^[135] ‘Data Driven Detroit | Data Driven Detroit’ <https://datadrivendetroit.org/> accessed 21 May 2024.

^[136] ‘Vision Philadelphia – Equity. Innovation. Leadership.’ <https://visionphiladelphia.org/> accessed 21 May 2024.

^[137] Francois van Schalkwyk, Alexander Andrason and Gustavo Magalhaes, ‘A New Harvest: A Review of the Literature on Data Ownership Focusing on the Agricultural Sector’ (29 October 2018) <https://papers.ssrn.com/abstract=3379530> accessed 17 May 2024.

^[138] Martin Boeckhout, Gerhard A Zielhuis and Annelien L Bredenoord, ‘The FAIR Guiding Principles for Data Stewardship: Fair Enough?’ (2018) 26 European Journal of Human Genetics 931.

^[139] Victoria Reyes-García and others, ‘Data Sovereignty in Community-Based Environmental Monitoring: Toward Equitable Environmental Data Governance’ (2022) 72 BioScience 714.

^[140] Stephanie Russo Carroll, Desi Rodriguez-Lonebear and Andrew Martinez, ‘Indigenous Data Governance: Strategies from United States Native Nations’ (2019) 18 Data science journal 31.

^[141] ‘Digital Earth Africa’ <https://www.digitalearthafrica.org/> accessed 22 May 2024.

^[142] ‘Accelerating Medicines Partnership (AMP)’ (National Institutes of Health (NIH)) <https://www.nih.gov/research-training/accelerating-medicines-partnership-amp> accessed 21 May 2024.

^[143] Stefaan Verhulst, Andrew Young and Prianka Srinivasan, ‘An Introduction to Data Collaboratives’.

^[144] Stefaan Verhulst, Andrew Young and Prianka Srinivasan, ‘An Introduction to Data Collaboratives’.

^[145] Discrepancy in N for this figure is due to entries being coded under multiple data governance structures.

^[146] ‘Brixham Data Trust’ (Data Trusts Initiative) <https://datatrusts.uk/pilot-brixham> accessed 20 May 2024.

^[147] ‘Born In Scotland’ (Data Trusts Initiative) <https://datatrusts.uk/pilot-bis> accessed 20 May 2024.

^[148] ‘Building a Place Based Data Trust for People and Planet – Building a Place Based Data Trust for People and Planet’ <https://thisisplace.org/> accessed 20 May 2024.

^[149] ‘Home – Open Humans’ <https://www.openhumans.org/> accessed 20 May 2024.

^[150] ‘Driver’s Seat Cooperative’ (Driver’s Seat Cooperative) <https://www.driversseat.co> accessed 20 May 2024.

^[151] ‘UK Biobank – UK Biobank’ (2 May 2024) <https://www.ukbiobank.ac.uk> accessed 20 May 2024.

^[152] ‘About Us’ (Answer ALS) <https://www.answerals.org/about-us/> accessed 20 May 2024.

^[153] ‘Home – OurBrainBank for Glioblastoma’ <https://www.ourbrainbank.org/> accessed 20 May 2024.

^[154] ‘Participation Model’ <https://www.agrefed.org.au/ParticipationModel> accessed 7 November 2024.

^[155] ‘Home · Solid’ <https://solidproject.org/> accessed 20 May 2024.

^[156] ‘Home’ (n 17).

^[157] Christian Boudreau, ‘Reuse of Open Data in Quebec: From Economic Development to Government Transparency’ (2021) 87 International Review of Administrative Sciences 855.

^[158] ‘Data Driven Detroit | Data Driven Detroit’ (n 135).

^[159] ‘CTData’ (CTData) <https://www.ctdata.org> accessed 21 May 2024.

^[160] ‘About | Civic Switchboard Guide’ (n 54).

^[161] Verhulst (n 111).

^[162] Marina Micheli, ‘The Governance of Personal Data for the Public Interest: Research Insights and Recommendations’ in Shaun Topham, Paolo Boscolo and Michael Mulquin, Personal Data-Smart Cities: How cities can Utilise their Citizen’s Personal Data to Help them Become Climate Neutral (1st edn, River Publishers 2023) <https://www.taylorfrancis.com/books/9781003399384/chapters/10.1201/9781003399384_14> accessed 20 March 2024.

^[163] Kieron O’Hara, ‘Data Trusts: Ethics, Architecture and Governance for Trustworthy Data Stewardship’ (Web Science Institute 2019) <https://eprints.soton.ac.uk/428276/> accessed 20 March 2024.

^[164] Rie Kamikubo, Kyungjun Lee and Hernisa Kacorri, ‘Contributing to Accessibility Datasets: Reflections on Sharing Study Data by Blind People’, Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (ACM 2023) <https://dl.acm.org/doi/10.1145/3544548.3581337> accessed 20 March 2024.

^[165] ‘Data Ownership, Rights and Controls: Seminar Report’ (The British Academy) <https://www.thebritishacademy.ac.uk/publications/data-ownership-rights-controls-seminar-report/> accessed 20 March 2024.

^[166] Hampapuram Ramapriyan and Jeanne Behnke, ‘Importance and Incorporation of User Feedback in Earth Science Data Stewardship’ (2019) 18 Data Science Journal 24.

^[167] Nisha Shah and others, ‘Sharing Data for Future Research-Engaging Participants’ Views about Data Governance beyond the Original Project: A DIRECT Study’ (2019) 21 Genetics in Medicine: Official Journal of the American College of Medical Genetics 1131.

^[168] ‘Co-Creation of the Digital Democracy and … | Open Research Europe’ <https://open-research-europe.ec.europa.eu/articles/4-45> accessed 29 May 2024.

^[169] Wellcome, Aapti Institute, and Sage Bionetworks, ‘Ethical Data Governance for Mental Health Databanks: A Framework for Risk Diagnosis and Mitigation Strategies’ (Wellcome Trust 2023) <https://wellcome.org/grant-funding/guidance/mental-health-databanks?trk=feed_main-feed-card_reshare_feed-article-content>.

^[170] Sara Marcucci and others, ‘Informing the Global Data Future: Benchmarking Data Governance Frameworks’ (2023) 5 Data & Policy e30.

^[171] Francesca Bria and others, ‘Governing Urban Data for the Public Interest’ (The New Institute 2023).

^[172] Michael Boniface and others, ‘The Social Data Foundation Model: Facilitating Health and Social Care Transformation through Datatrust Services’ (2022) 4 Data & Policy e6.

^[173] Christian Wendelborn, Michael Anger and Christoph Schickhardt, ‘What Is Data Stewardship? Towards a Comprehensive Understanding’ (2023) 140 Journal of Biomedical Informatics 104337.

^[174] Paul Box and others, Guidelines for the Development of a Data Stewardship and Governance Framework for the Agricultural Research Federation (AgReFed) Version 1.1. CSIRO (2019).

^[175] Michael Boniface and others, ‘A Blueprint for a Social Data Foundation: Accelerating Trustworthy and Collaborative Data Sharing for Health and Social Care Transformation’ (University of Southampton 2020) <https://southampton.ac.uk/~assets/doc/wsi/WSI%20white%20paper%204%20social%20data%20foundations.pdf> accessed 21 March 2024.

^[176] Darren Sharp and others, ‘A Participatory Approach for Empowering Community Engagement in Data Governance: The Monash Net Zero Precinct’ (2022) 4 Data & Policy e5.

^[177] ‘Where Is Participatory Data Now?’ (The ODI, 26 September 2023) <https://theodi.org/news-and-events/blog/where-is-participatory-data-now/> accessed 6 November 2023.

^[178] Francesca Bria and others, ‘Governing Urban Data for the Public Interest’ (The New Institute 2023).

^[179] Cystic Fibrosis Trust, ‘UK Cystic Fibrosis Registry’ <https://www.cysticfibrosis.org.uk/about-us/uk-cf-registry> accessed 22 May 2024.

^[180] Public Health Scotland, ‘What Is eDRIS? – Overview – Electronic Data Research and Innovation Service (eDRIS) – Data Research and Innovation Services – Services – Public Health Scotland’ <https://publichealthscotland.scot/services/data-research-and-innovation-services/electronic-data-research-and-innovation-service-edris/overview/what-is-edris/> accessed 22 May 2024.

^[181] Migrants Resilience Collaborative, ‘Home | Migrants Resilience Collaborative’ <https://migrantresilience.org> accessed 22 May 2024.

^[182] ‘Accelerating Medicines Partnership (AMP)’ (n 134).

^[183] Kaiser Permanente Research Bank, ‘Kaiser Permanente Research Bank – Kaiser Permanente’ (Kaiser Permanente Research Bank) <https://researchbank.kaiserpermanente.org/> accessed 22 May 2024.

^[184] HDR UK, ‘INSIGHT – Our Hub for Eye Health’ <https://www.hdruk.org/helping-with-health-data/health-data-research-hubs/insight/> accessed 22 May 2024.

^[185] MIDATA, ‘MIDATA Cooperative’ (MIDATA) <https://www.midata.coop/en/cooperative/> accessed 22 May 2024.

^[186] ‘Digital Earth Africa’ <https://www.digitalearthafrica.org/> accessed 22 May 2024.

^[187] National Institutes of Health (NIH), ‘All of Us Research Program | National Institutes of Health (NIH)’ (All of Us Research Program | NIH, 6 January 2020) <https://allofus.nih.gov/future-health-begins-all-us> accessed 22 May 2024.

^[188] AMdEX, ‘About AMdEX – What is AMdEX and why is data sharing crucial?’ (AMdEX) <https://amdex.eu/about/> accessed 22 May 2024.

^[189] ‘Datafund – Reclaim Your Data, Reclaim Your Freedom’ (Datafund – Reclaim your data, reclaim your freedom) <https://www.datafund.io/> accessed 21 May 2024.

^[190] Blasimme, Vayena and Hafen (n 48).

^[191] Igor Calzada, ‘Data Co-Operatives through Data Sovereignty’ (2021) 4 Smart Cities 1158.

^[192] Michael Max Bühler and others, ‘Unlocking the Power of Digital Commons: Data Cooperatives as a Pathway for Data Sovereign, Innovative and Equitable Digital Communities’ (2023) 3 Digital 146.

^[193] Elettra Bietti and others, ‘Data Cooperatives in Europe: A Legal and Empirical Investigation’ (2021) White Paper <https://cyber.harvard.edu/sites/default/files/2022-02/Data_Cooperatives_Europe-group2.pdf> accessed 25 March 2024.

^[194] ‘Homepage’ (JoinData) <https://join-data.nl/en/homepage/> accessed 22 May 2024.

^[195] ‘Driver’s Seat Cooperative’ (n 20).

^[196] Astha Kapoor and others, ‘Exploring the Potential of Data Stewardship in the Migration Space | German Marshall Fund of the United States’ (2022) <https://www.gmfus.org/news/exploring-potential-data-stewardship-migration-space> accessed 22 May 2024.

^[197] ‘Fostering Participatory Data Stewardship | Aapti Institute’ <https://aapti.in/fostering-participatory-data-stewardship/> accessed 6 November 2023.

^[198] Ada Lovelace Institute (n 2).

^[199] ‘Home’ (n 17).

^[200] Debora Irene Christine and Mamello Thinyane, ‘Citizen Science as a Data-Based Practice: A Consideration of Data Justice’ (2021) 2 Patterns 100224.

^[201] Micheli (n 162).

^[202] Rosie Alegado, Katy Hintzen and Sara Kahanamoku, ‘Kūlana Noiʻi: Indigenous Data Stewardship in Hawaiʻi’ <https://hdl.handle.net/10125/104896> accessed 21 May 2024.

^[203] National Institutes of Health (NIH), ‘All of Us Research Program | National Institutes of Health (NIH)’ (All of Us Research Program | NIH, 6 January 2020) <https://allofus.nih.gov/future-health-begins-all-us> accessed 22 May 2024.

^[204] Kalinda E Griffiths and others, ‘Indigenous and Tribal Peoples Data Governance in Health Research: A Systematic Review’ (2021) 18 International Journal of Environmental Research and Public Health 10318.

^[205] ‘Exploring Legal Mechanisms for Data Stewardship’ <https://www.adalovelaceinstitute.org/report/legal-mechanisms-data-stewardship/> accessed 13 December 2024.

^[206] Ada Lovelace Institute (n 2).

^[207] Arnstein (n 3).

^[208] Christian Wendelborn, Michael Anger and Christoph Schickhardt, ‘What Is Data Stewardship? Towards a Comprehensive Understanding’ (2023) 140 Journal of Biomedical Informatics 104337.

^[209] Joint Research Centre (European Commission) and others (n 23).

^[210] Quick and Feldman (n 80).

^[211] Vinay Narayan and Joe Massey, ‘Co-Designing Data Trusts for Climate Action’ (Co-designing data trusts for climate action) <https://datatrusts.uk/blogs/co-designing-data-trusts-for-climate-action> accessed 22 May 2024.

^[212] ‘African Data Trusts: New Tools towards Collective Data Governance?’ (n 44).

^[213] Rediet Abebe and others, ‘Narratives and Counternarratives on Data Sharing in Africa’, Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (Association for Computing Machinery 2021) <https://doi.org/10.1145/3442188.3445897> accessed 9 May 2024.

^[214] Jacqueline Hicks, ‘The Future of Data Ownership: An Uncommon Research Agenda’ (2023) 71 The Sociological Review 544.

^[215] Ada Lovelace Institute (n 1).

^[216] Joint Research Centre (European Commission) and others (n 23).

^[217] Centre for Data Ethics and Innovation (n 63).

^[218] Regulation (EU) 2022/868 of the European Parliament and of the Council of 30 May 2022 on European data governance and amending Regulation (EU) 2018/1724 (Data Governance Act) (Text with EEA relevance) 2022 (OJ L).

^[219] ‘Data Governance Act Explained | Shaping Europe’s Digital Future’ (n 98).

^[220] See Article 10 of the Data Governance Act. Available at <https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A32022R0868>.

^[221] ‘EU Register of Data Intermediation Services | Shaping Europe’s Digital Future’ <https://digital-strategy.ec.europa.eu/en/policies/data-intermediary-services> accessed 12 December 2024.

^[222] See Dataspace Europe Oy, Smarter Contracts and iGrant.io. Notification available at <https://digital-strategy.ec.europa.eu/en/policies/data-intermediary-services>.

^[223] See European Commission Health Data Space announcements. Available at <https://health.ec.europa.eu/ehealth-digital-health-and-care/european-health-data-space_en> and <https://ec.europa.eu/commission/presscorner/detail/en/ip_24_2250>.

^[224] See European Council Health Data Space announcement. Available at <https://www.consilium.europa.eu/en/press/press-releases/2024/03/15/european-health-data-space-council-and-parliament-strike-provisional-deal/>.

^[225] ‘New EU Rules Make High-Value Datasets Available to Fuel Artificial Intelligence and Data-Driven Innovation | Shaping Europe’s Digital Future’ (10 June 2024) <https://digital-strategy.ec.europa.eu/en/news/new-eu-rules-make-high-value-datasets-available-fuel-artificial-intelligence-and-data-driven> accessed 12 October 2024.

^[226] ‘Data Act | Shaping Europe’s Digital Future’ <https://digital-strategy.ec.europa.eu/en/policies/data-act> accessed 12 October 2024.

^[227] Arnstein (n 3).

^[228] ‘Core Values, Ethics, Spectrum – The 3 Pillars of Public Participation – International Association for Public Participation’ <https://www.iap2.org/page/pillars> accessed 4 May 2022.

^[229] Ada Lovelace Institute (n 2).

^[230] Arnstein (n 3).

^[231] IAP2, ‘IAP2 Spectrum of Public Participation’ <https://iap2.org.au/wp-content/uploads/2020/01/2018_IAP2_Spectrum.pdf> accessed 30 September 2024.

^[232] ‘A Practical Framework for Applying Ostrom’s Principles to Data Commons Governance – Mozilla Foundation’ <https://foundation.mozilla.org/en/blog/a-practical-framework-for-applying-ostroms-principles-to-data-commons-governance/> accessed 13 December 2024.

^[233] Anouk Ruhaak and others, ‘A Practical Framework for Applying Ostrom’s Principles to Data Commons Governance’ (6 December 2021) <https://foundation.mozilla.org/en/blog/a-practical-framework-for-applying-ostroms-principles-to-data-commons-governance/> accessed 21 May 2024.

^[234] ‘Deed – Attribution-NonCommercial-ShareAlike 2.0 Generic – Creative Commons’ <https://creativecommons.org/licenses/by-nc-sa/2.0/> accessed 13 December 2024.

^[235] ‘A Data Stewardship Framework for NZ – Data.Govt.Nz’ (n 101).

^[236] ‘Deed – Attribution 4.0 International – Creative Commons’ <https://creativecommons.org/licenses/by/4.0/> accessed 13 December 2024.

^[237] ‘Where Is Participatory Data Now?’ (The ODI, 26 September 2023) <https://theodi.org/news-and-events/blog/where-is-participatory-data-now/> accessed 6 November 2023.

^[238] ‘What Makes Participatory Data Initiatives Successful?’ (The ODI, 24 June 2024) <https://theodi.org/insights/reports/what-makes-participatory-data-initiatives-successful/> accessed 12 November 2024.

^[239] ibid.

^[240] ‘Fostering Participatory Data Stewardship | Aapti Institute’ (n 34).

^[241] Mark D Wilkinson and others, ‘The FAIR Guiding Principles for Scientific Data Management and Stewardship’ (2016) 3 Scientific Data 160018.

^[242] Christian Wendelborn, Michael Anger and Christoph Schickhardt, ‘What Is Data Stewardship? Towards a Comprehensive Understanding’ (2023) 140 Journal of Biomedical Informatics 104337 2.

^[243] ‘FAIR Principles’ (GO FAIR) <https://www.go-fair.org/fair-principles/> accessed 12 December 2024.

^[244] ‘CARE Principles’ (Global Indigenous Data Alliance, 23 January 2023) <https://www.gida-global.org/care> accessed 13 December 2024.

^[245] Stephanie Russo Carroll and others, ‘Operationalizing the CARE and FAIR Principles for Indigenous Data Futures’ (2021) 8 Scientific Data 108.

^[246] Stephanie Russo Carroll and others, ‘The CARE Principles for Indigenous Data Governance’ (2020) 19 Data Science Journal <https://datascience.codata.org/articles/10.5334/dsj-2020-043> accessed 12 November 2024.

^[247] Stephanie Russo Carroll and others, ‘Extending the CARE Principles from Tribal Research Policies to Benefit Sharing in Genomic Research’ (2022) 13 Frontiers in Genetics <https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2022.1052620/full> accessed 12 November 2024.

^[248] ‘Deed – Attribution-NonCommercial-NoDerivatives 4.0 International – Creative Commons’ <https://creativecommons.org/licenses/by-nc-nd/4.0/> accessed 13 December 2024.

^[249] Joint Research Centre (European Commission) and others (n 23).

^[250] ‘Deed – Attribution 4.0 International – Creative Commons’ <https://creativecommons.org/licenses/by/4.0/deed.en> accessed 13 December 2024.

^[251] ‘About’ (Connecticut Data Collaborative) <https://www.ctdata.org/about> accessed 16 May 2024.

^[252] World Economic Forum, ‘Advancing Digital Agency: The Power of Data Intermediaries’ (2022) <https://www.weforum.org/publications/advancing-digital-agency-the-power-of-data-intermediaries/> accessed 16 May 2024.

^[253] ‘Cozy Cloud – A Personal Cloud to Gather All Your Data’ <https://cozy.io/en/> accessed 16 May 2024.

^[254] ‘Home’ (BitsAboutMe) <https://bitsabout.me/en/> accessed 16 May 2024.

^[255] ‘About Us’ (Hestia AI) <https://hestia.ai/about-us> accessed 16 May 2024.

^[256] Camden Council, ‘Camden Council’s Data Charter’ (2023) <https://www.camden.gov.uk/data-charter> accessed 15 May 2024.

^[257] ‘The CityVerve Project’ (Digital Futures) <https://www.digitalfutures.manchester.ac.uk/about-us/case-studies/cityverve/> accessed 15 May 2024.

^[258] ‘Genomics England’ (Genomics England, 1 March 2023) <https://www.genomicsengland.co.uk/about-us> accessed 15 May 2024.

^[259] ‘ISI Foundation’ (ISI Foundation) <https://www.isi.it/> accessed 16 May 2024.

^[260] ‘Data and Evidence to Advance Upward Mobility and Equity’ (Urban Institute) <https://www.urban.org/> accessed 16 May 2024.

^[261] ‘The Information Accountability Foundation’ (The Information Accountability Foundation) <https://informationaccountability.org/> accessed 16 May 2024.

^[262] ‘Home’ (Jan Sahas Foundation) <https://jansahas.org/> accessed 16 May 2024.

^[263] ‘A Data Stewardship Framework for NZ – Data.Govt.Nz’ <https://www.data.govt.nz/toolkit/data-stewardship/a-data-stewardship-framework-for-nz/> accessed 21 May 2024.

^[264] ‘Accelerating Medicines Partnership (AMP)’ (National Institutes of Health (NIH)) <https://www.nih.gov/research-training/accelerating-medicines-partnership-amp> accessed 21 May 2024.

^[265] ‘Data Driven Detroit | Data Driven Detroit’ <https://datadrivendetroit.org/> accessed 21 May 2024.

^[266] ‘Vision Philadelphia – Equity. Innovation. Leadership.’ <https://visionphiladelphia.org/> accessed 21 May 2024.

^[267] Classifying literature into distinct categories was a challenge in itself, as sometimes judgements needed to be made on the extent to which a piece of work was proposing novel ideas, leaning on existing literature or analysing implemented projects.

^[268] ‘African Data Trusts: New Tools towards Collective Data Governance?’ <https://www.tandfonline.com/doi/epdf/10.1080/13600834.2023.2260678?needAccess=true> accessed 9 May 2024.

^[269] Alessandro Blasimme, Effy Vayena and Ernst Hafen, ‘Democratizing Health Research Through Data Cooperatives’ (2018) 31 Philosophy & Technology 473.

^[270] Gijs van Maanen and Annemarie Balvert, ‘Open for Whom?: The Role of Intermediaries in Data Publication’ (2019) 5 Publicum 129.

^[271] Masaharu Tsujimoto and Soichiro Tanaka, ‘Case Study of the Customer Acceptance of Personal Data Trust Bank in Japan: A Questionnaire Survey’, Handbook on Digital Business Ecosystems (Edward Elgar Publishing 2022) <https://www.elgaronline.com/display/edcoll/9781839107184/9781839107184.00050.xml> accessed 21 May 2024.

^[272] Albert Meijer and Suzanne Potjer, ‘Citizen-Generated Open Data: An Explorative Analysis of 25 Cases’ (2018) 35 Government Information Quarterly 613.

^[273] Kalinda E Griffiths and others, ‘Indigenous and Tribal Peoples Data Governance in Health Research: A Systematic Review’ (2021) 18 International Journal of Environmental Research and Public Health 10318.

^[274] Esther Thea Inau and others, ‘Initiatives, Concepts, and Implementation Practices of the Findable, Accessible, Interoperable, and Reusable Data Principles in Health Data Stewardship: Scoping Review’ (2023) 25 Journal of Medical Internet Research e45013.

^[275] ‘The DPI Approach: A Playbook’ (United Nations Development Programme 2023) <https://www.undp.org/publications/dpi-approach-playbook> accessed 21 May 2024.

^[276] Anouk Ruhaak and others, ‘A Practical Framework for Applying Ostrom’s Principles to Data Commons Governance’ (6 December 2021) <https://foundation.mozilla.org/en/blog/a-practical-framework-for-applying-ostroms-principles-to-data-commons-governance/> accessed 21 May 2024.

^[277] Parminder Jeet Singh and Jai Vipra, ‘Economic Rights Over Data: A Framework for Community Data Ownership’ (2019) 62 Development 53.

^[278] Sally Okun and others, ‘Commitments for Ethically Responsible Sourcing, Use, and Reuse of Patient Data in the Digital Age: Cocreation Process’ (2023) 25 Journal of Medical Internet Research e41095.

^[279] Erna Ruijer and others, ‘Open Data Work for Empowered Deliberative Democracy: Findings from a Living Lab Study’ (2024) 41 Government Information Quarterly 101902.

^[280] R Patel, ‘Rethinking Data: Changing Narratives, Practices and Perspectives’ (Ada Lovelace Institute 2020).

Image credit: SolStock

