Democratizing Data: Citizen Science and AI for Inclusive Policy Design

Author: GIZ (Deutsche Gesellschaft für Internationale Zusammenarbeit)

Editor's Note: In a world where data is expanding at an exponential rate, one of the most valuable sources of evidence can be people themselves. When collected responsibly and effectively, insights from citizens, whether through community reports, lived experiences or citizen science initiatives, can enrich official data, reveal overlooked realities and ensure that policies reflect the complexity of real-world challenges.

Background: The Democratization of Science in Citizen Participation

Citizen Science (CS) is an innovative research methodology that promotes the participation of non-professional individuals in scientific investigations. Historically, scientific research has been time-intensive and largely overseen by academics and researchers working in institutes. CS challenges this standard by shifting all or some stage(s) of the research process to a more democratic setting. This may involve different modes of participation such as collecting data, analysing data, empowering others to contribute their knowledge, or fostering inclusive science.  

Bonney and Ballard defined the idea of Citizen Science as the active participation of non-researchers in scientific research. They also offered a categorization of the different CS models according to the degree of non-researcher participation (Bonney, Ballard et al., 2009), as depicted below.

Source: Shirk et al., 2012.

CS offers the potential for greater societal impact and democratization of science, ensuring that scientific research and its outcomes do not exist in a vacuum, but engage with the very communities being studied. This approach empowers citizens to contribute towards the creation of academic knowledge (Bonney et al., 2009) and makes science more responsive to their needs (Irwin, 1995).  

More recently, the Copenhagen Framework on Citizen Data has further refined this concept by defining “citizen data” as data stemming from initiatives where citizens are sufficiently engaged in the design and/or collection stages of the data process. This participation aims to increase inclusiveness and responsiveness and to reflect diverse community needs.

Open Frameworks for Citizen Science

Numerous frameworks are available that classify and organize the different types of citizen participation in CS:  

The Multi-Dimensional Framework is used to evaluate CS initiatives based on their activities, knowledge exchange, resource mobilization and decision-making processes.

Source: https://www.tandfonline.com/doi/full/10.1080/13662716.2021.1976627

The STORCIT framework (below) shows how storytelling methods can be included in citizen science and make this approach more inclusive by broadening participation and embedding diverse voices.

Source: https://www.frontiersin.org/journals/environmental-science/articles/10.3389/fenvs.2023.1211213/full

Conducting scientific research through a bottom-up rather than a top-down approach has numerous benefits, such as fostering positive feedback loops. For example, when researchers work directly with citizens, research questions can become more precise, hard-to-reach communities become easier to access, and research outcomes and conclusions gain relevance. In this way, citizen-generated data makes research more participatory and more responsive to public needs, and it can enable citizens to contribute actively to the creation of scholarly knowledge (Bonney et al., 2009; Irwin, 1995).

How AI-Powered Citizen Science Strengthens the Data Value Chain

The Data Value Chain framework employed by international non-profit organization Open Data Watch outlines a pathway for data collection, processing and preparation, transforming raw data into actionable information. Each step of this value chain can be enhanced through AI-powered CS. For example, data collection can be optimized through process automation; data processing and analysis can be accelerated with machine learning algorithms; and new tools enable real-time feedback on findings, ensuring continuous improvement and more efficient decision-making.

Source:  https://opendatawatch.com/wp-content/uploads/Data-Value-Chain/ODW-Data2x-Data-Value-Chain-simple-CC-BY-ATTRIBUTION.jpg

Citizen Science vs. Crowdsourced Data

Crowdsourced data is data gathered from large communities. These types of projects usually welcome open participation from a variety of contributors, including both experts and non-experts. However, in most crowdsourcing efforts, the public’s role is largely confined to data collection (Eitzel et al., 2021).

In contrast, citizen science (CS) fosters more collaborative and sustained involvement. Rather than limiting participation to data collection, CS typically includes activities such as analysis, interpretation, and even project development, with citizens working closely alongside scientists. This emphasis on ongoing, meaningful engagement distinguishes CS as a deliberate approach that encourages deeper interaction and cooperation throughout the research process.

AI-Powered Citizen Science (applications)

AI-powered CS uses AI to enhance the stages of CS initiatives, such as data collection, processing and analysis. AI applications enable stakeholders with limited research expertise to play a meaningful role in gathering, evaluating and interpreting data. Key benefits of AI systems include their ability to analyse large volumes of data, identify patterns and provide immediate feedback. Examples of AI applications that could enhance CS for use in policymaking include:

Natural Language Processing

The ability to make machines understand the written or spoken language of human beings is called Natural Language Processing (NLP). One of NLP’s functions is to assess large documents and volumes of text that can be transformed into data. This includes citizen reports or social media posts which can be analysed to extract information that guides policy decisions. NLP is a core strength of modern Large Language Models (LLMs) and is a foundational component of their functionality. Recent breakthroughs in transformer architecture—a critical element of LLMs—have significantly improved their ability to understand context and relationships between concepts, enabling more accurate and nuanced text analysis at scale.
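To make the idea of turning text into data concrete, the following is a minimal sketch that tags free-text citizen reports with themes and counts them. It uses plain keyword matching rather than an LLM, and the theme names, keywords and example reports are all made up for illustration; a real project would need multilingual handling and a far richer model.

```python
import re
from collections import Counter

# Hypothetical themes and keywords, chosen only for illustration.
THEME_KEYWORDS = {
    "childcare": {"childcare", "daycare", "nursery"},
    "transport": {"bus", "commute", "transport"},
    "waste": {"litter", "plastic", "waste"},
}


def tag_themes(report: str) -> set[str]:
    """Return the themes whose keywords appear in a free-text report."""
    words = set(re.findall(r"[a-z]+", report.lower()))
    return {theme for theme, keys in THEME_KEYWORDS.items() if words & keys}


def count_themes(reports: list[str]) -> Counter:
    """Count how many reports mention each theme (once per report)."""
    counts = Counter()
    for report in reports:
        counts.update(tag_themes(report))
    return counts


# Made-up example reports.
reports = [
    "There is no daycare near my workplace.",
    "Plastic waste is piling up on the beach.",
    "The bus takes two hours, and the nursery closes at four.",
]
theme_counts = count_themes(reports)
print(theme_counts.most_common())
```

Counting each theme once per report, rather than once per keyword hit, keeps a single long report from dominating the totals that a policymaker would read.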

Machine Learning

Machine learning (ML) is another AI technology that enables computers to learn from data and identify patterns without being explicitly programmed for a specific task. In the context of CS, machine learning algorithms can analyse data from citizens, such as environmental monitoring contributions, to uncover patterns and provide insights. These insights can support policymakers by highlighting trends or suggesting potential actions.
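As a small illustration of highlighting trends in citizen contributions, the sketch below fits a straight line to weekly counts by ordinary least squares, one of the simplest models whose parameters are learned from data. The site names and counts are invented, and a real monitoring project would likely use richer models and account for uneven volunteer effort.

```python
def fit_trend(values: list[float]) -> tuple[float, float]:
    """Fit y = slope * t + intercept by least squares, with t = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    return slope, y_mean - slope * t_mean


# Made-up weekly item counts reported by volunteers at two sites.
weekly_counts = {
    "site_a": [10, 12, 14, 16],
    "site_b": [20, 18, 16, 14],
}

# Keep only the slope: the estimated change in count per week.
trends = {site: fit_trend(counts)[0] for site, counts in weekly_counts.items()}
rising = sorted(site for site, slope in trends.items() if slope > 0)
print(trends, rising)
```

A positive slope flags a site where reported counts are growing week on week, which is the kind of signal that could prompt a closer look.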

Deep Learning

Deep learning is a subset of ML that trains artificial neural networks with multiple layers to process and analyse complex patterns in data. Loosely inspired by the way the human brain processes information, it enables systems to extract features automatically from raw data and to make predictions and classifications. Deep learning excels at handling large and diverse datasets, such as images, audio or text, and is widely used in applications such as image recognition, for example analysing satellite images to track changes in real time.

AI-Powered Citizen Science – Case Study Examples

Collaborating With a Non-Profit Organization in Indonesia to Reduce Deforestation

Climate change is one of the most pressing issues of our time and deforestation is a significant contributor to this problem. According to the World Resources Institute, Indonesia is one of the countries most affected by climate change and deforestation. Over 70% of the country’s forests have been lost or degraded over the past 50 years. This alarming loss of forest cover poses a significant threat not only to the environment and biodiversity but also to the local and Indigenous communities who depend on the forests for their livelihoods.

The High Carbon Stock Approach (HCSA), a non-profit organization with the mission to promote forest conservation and ecosystem protection, provides guidance on identifying, managing, and monitoring HCS levels in forests. This helps ensure that forest maps accurately reflect the conditions on the ground. By supporting responsible land use practices, HCSA also helps prevent deforestation, reduce greenhouse gas emissions, and safeguard the rights and livelihoods of local communities.  

With the support of the FAIR Forward initiative from GIZ, HCSA collaborated with the local mapping organization JKPP in Indonesia to gather critical forest biomass data, including Diameter at Breast Height (DBH) measurements, tree species identification, forest classification and biomass estimation, as well as socio-economic data. By leveraging AI, this rich dataset can significantly improve the accuracy of forest classification and monitoring. AI can analyse vast amounts of complex data to identify patterns and trends that might otherwise go unnoticed, enabling more precise mapping and assessment of forest conditions. These advancements can support targeted conservation efforts, enhance sustainable land-use planning, and contribute to more effective strategies for combating deforestation and mitigating climate change.  
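To illustrate how field measurements such as DBH can feed a biomass estimate, here is a minimal sketch that assumes a generic power-law allometric model, biomass = A * DBH^B. The coefficients A and B, the function names and the plot layout are placeholders for illustration only; they are not the values or methods used by HCSA or JKPP, and real estimates depend on region- and species-specific equations.

```python
# Placeholder coefficients for a generic power-law allometric model:
#   above-ground biomass (kg) = A * DBH (cm) ** B
# Illustrative values only, not taken from the project described above.
A = 0.1
B = 2.5


def tree_biomass_kg(dbh_cm: float) -> float:
    """Estimate one tree's above-ground biomass from its DBH in centimetres."""
    if dbh_cm <= 0:
        raise ValueError("DBH must be positive")
    return A * dbh_cm ** B


def plot_biomass_t_per_ha(dbh_list_cm: list[float], plot_area_ha: float) -> float:
    """Sum tree biomass over a sample plot and scale to tonnes per hectare."""
    if plot_area_ha <= 0:
        raise ValueError("plot area must be positive")
    total_kg = sum(tree_biomass_kg(d) for d in dbh_list_cm)
    return total_kg / 1000 / plot_area_ha


# Made-up DBH readings (cm) from a hypothetical 0.04 ha sample plot.
readings_cm = [12.5, 30.0, 8.2, 45.1]
print(round(plot_biomass_t_per_ha(readings_cm, plot_area_ha=0.04), 1))
```

Per-hectare values like this are the kind of ground-truth figure that could be paired with imagery to train or check a forest classification model.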

The Approach  

This collaborative effort, led by HCSA in partnership with environmental consultancy Ekologika and JKPP, trained more than 40 participants from diverse regions and organizations across Indonesia to collect critical forest data, fostering local ownership and accuracy in forest mapping. By actively involving communities, the approach not only produces large-scale HCS/HCV maps for more informed, AI-driven decisions on deforestation prevention and sustainable supply chains but also holds strong potential for scaling across tropical regions to support smallholder farmers and broader, more inclusive land-use policies.

Democratizing Voices of Women for Policymaking in Mexico

Women in Mexico City face significant barriers to economic participation, largely due to the unequal division of unpaid care work. This imbalance limits their opportunities for paid employment, exacerbating gender inequality in economic empowerment. The lack of comprehensive data on the challenges women face in balancing work and childcare has made it difficult to design effective policies to address the issue.

Using citizen input to understand the problem

To bridge this data gap, the city government, with support from GIZ, launched an initiative to gather insights directly from women. Through a participatory approach, data on women's needs, challenges, and limitations related to work and childcare was collected using digital platforms. This crowdsourced data provided a foundational understanding of the systemic barriers women face.

Transitioning to CS for policy recommendations

Building on the initial crowdsourcing effort, the project adopted a CS approach, involving women not just in sharing their experiences but in shaping policy recommendations. The project allowed women to contribute data and feedback directly, which was analysed alongside other data sources such as administrative, census and survey data to identify targeted solutions, such as the need for accessible childcare services. By actively engaging women in the process, the initiative fostered a deeper connection between citizens and policymakers, highlighting women’s barriers and needs and contributing to policies that reflect lived experiences and real-world challenges.

Scalability and sustainability

A key strength of this initiative is its adaptability and potential for scalability. The data platform integrates multiple sources, including citizen-generated data, to provide a dynamic and evolving understanding of women’s needs. This iterative approach ensures that policies remain relevant and responsive as societal circumstances change. Sustained progress towards gender equality will require ongoing collaboration between policymakers and citizens, as well as continued capacity-building to empower women and promote inclusive economic participation in Mexico City.

Enabling women’s economic empowerment: This project addresses the unequal division of unpaid care work in Mexico City, which hinders women's economic participation. The city government, with GIZ's support, gathered data and developed a platform to analyse it. Citizen input helped understand women's needs and challenges concerning work and childcare.

Citizen science powering policy recommendations: Citizen-sourced data provided insights into women's desires and limitations related to paid work. This "citizen science" approach helped identify key areas for policy, like accessible childcare. Platforms like ProsperIA and IncluIA allowed citizens to contribute directly, leading to targeted interventions designed to reduce the care burden on women.

Scalability and sustainability: The project's strength lies in its ability to adapt and grow. The data platform integrates various sources, including citizen input, to address current needs and inform future policies. This iterative approach ensures policies remain relevant as society evolves. Continued collaboration and capacity building for both policymakers and citizens are crucial to sustain progress towards gender equality and inclusive economic participation in Mexico City.

Leveraging Citizen Science to Monitor Marine Litter in Ghana  

Ghana faces a serious threat from plastic waste accumulating along its coastline, with harmful implications for marine life and coastal communities. The lack of official data on plastic pollution has historically made it difficult to track the issue and develop solutions. This project sought to address the data gap by engaging a broad network of stakeholders, including government agencies, international organizations and volunteers.

Crowdsourcing to fill the data gap

In the early stages, crowdsourced data played a pivotal role in tracking plastic waste. Thanks to the efforts of citizens, the Government of Ghana was able to create accurate, affordable and up-to-date reports on marine litter. This data positioned Ghana as a leader and among the first countries to use CS for official environmental reporting.

Moving towards Citizen Science

Because crowdsourcing usually limits public involvement to data collection, the initiative shifted towards a citizen science (CS) model. This model is characterized by closer collaboration among stakeholders, who may include volunteers, researchers and policymakers. Volunteers not only gather data but also work alongside experts to analyse findings and co-create solutions. By moving from simple data gathering to a more active and collaborative process, the Ghana marine litter project adopted a “bottom-up” approach that improved marine pollution monitoring and reporting while inspiring broader use of citizen-generated data for policymaking. Notably, the Ghana Statistical Service (GSS) intends to apply similar CS methods to other areas, such as the Public Services Satisfaction (PSS) app, signaling a move toward more inclusive, data-driven policies.

Conclusion: Powering Citizen Science with AI for Data-Informed Policies

Integrating artificial intelligence with citizen science offers an opportunity to revolutionize policymaking, making it more timely, accurate, multifaceted and adaptive. This collaborative approach brings together the knowledge and expertise of citizens, academia and government, enhanced by AI tools that can surface deeper, evidence-based insights into population needs.

This approach also empowers communities by establishing shared ownership over the process of data generation and analysis. It enables them to assert their presence, highlight areas they consider important (often overlooked or dismissed by traditional data systems) and integrate non-traditional data sources into the policy discourse. This fosters a sense of ownership and inclusion in shaping decisions that directly impact their lives. Furthermore, it facilitates the integration of diverse worldviews into decision-making processes, enriching outcomes with a broader and more inclusive perspective. In this way, policies are grounded in practical experience and tailored to the specific context of the communities they aim to serve.

Ultimately, this concept aims to spark dialogue and secure funding for AI-based CS, opening the door to more participatory governance. By integrating data-driven approaches with meaningful public involvement, we can ensure that policies are not only informed by evidence but also genuinely beneficial and inclusive for all.

Click here to see a roadmap that can support you in your journey of using AI-powered citizen generated data for policymaking.

References

Collaborative on Citizen Data. (2024). The Copenhagen Framework on Citizen Data. Background document for the Fifty-fifth session of the United Nations Statistical Commission, 27 February–1 March 2024. https://unstats.un.org/statcom/session_55/documents/

Eitzel, M. V., Cappadonna, J. L., Santos-Lang, C., Duerr, R. E., Virapongse, A., West, S. E., Kyba, C. C. M., Bowser, A., Cooper, C. B., Sforzi, A., Metcalfe, A. N., Harris, E. S., Thiel, M., Haklay, M. M., Ponciano, L., Roche, J., Ceccaroni, L., Shirk, J. L., Evely, A. C., & Jiang, Q. (2017). Citizen science terminology matters: Exploring key terms. Citizen Science: Theory and Practice, 2(1), 1. https://doi.org/10.5334/cstp.96

Eitzel, M. V., Purdam, K., Schäfer, T., & Davies, L. (2023). Conceptualizing the citizen science data lifecycle. In S. Geertman, A. J. Miller, H. J. Miller, & Z. Yan (Eds.), Proceedings of the 18th International Conference on Geographic Information Science (pp. 8:1–8:4). Schloss Dagstuhl – Leibniz-Zentrum für Informatik.  

Grassini, L., Beigl, P., Malakhatka, E., & Schütz, H. (2023). The Storcit-framework: A framework for making citizen science inclusive with storytelling methods. Frontiers in Environmental Science, 11, Article 1211213. https://doi.org/10.3389/fenvs.2023.1211213

Kosmala, M., Wiggins, A., & Swanson, A. (2016). Machine Learning in Citizen Science: Promises and Implications. In Citizen Science: Innovation in Open Science, Society and Policy (pp. 1-18). Springer International Publishing. https://doi.org/10.1007/978-3-030-58278-4_10

Mascheroni, G., & Re, M. (2022). On citizen science communication and interaction: Fostering dialogue between science and society. Journal of Science Communication, 21(1), A07. https://doi.org/10.22323/2.21010207

Purdam, K., Kieslinger, B., Bonn, A., & Heigl, F. (2020). The spectrum of citizen science participation. Citizen Science: Theory and Practice, 5(1).

Reis, J. P., Arruda, J., Oliveira, J., & Cameirão, M. S. (2023). Exploring the impact of artificial intelligence in citizen science. Citizen Science: Theory and Practice, 8(1). https://doi.org/10.5334/cstp.584

TIME4CS. (2023, September 12). Citizen Science & Artificial Intelligence Technologies: Collaborating for an innovative and unbiased future. https://www.time4cs.eu/events/citizen-science-and-artificial-intelligence-technologies-collaborating-for-an-innovative-and-unbiased-future  

United Nations. (2022, November). Citizens' contribution to data. https://unstats.un.org/sdgs/files/meetings/harnessing-data-by-citizens-for-public-policy-and-SDG-monitoring/Citizens-contribution-to-data-background-paper-202211.pdf

United Nations Statistics Division. (2023). The Copenhagen Framework on Citizen Data. United Nations. https://unstats.un.org/UNSDWebsite/statcom/session_55/documents/BG-4c-CGD_Framework-E.pdf

Veeckman, C., Claes, S., Van Audenhove, L., & Van Der Graaf, S. (2023). A framework for making citizen science inclusive with storytelling methods. Frontiers in Environmental Science, 11, 1-12. https://doi.org/10.3389/fenvs.2023.1211213

von Gönner, J., Herrmann, T. M., Bruckermann, T., et al. (2023). Citizen science’s transformative impact on science, citizen empowerment and socio-political processes. Socio-Ecological Practice Research, 5, 11–33. https://doi.org/10.1007/s42532-022-00136-4

Resources

Checklist for AI Powered Citizen Data