We don’t all need to be data scientists! Data capacities, in the context of policymaking, refer to the abilities of data infrastructure, systems and human resources to efficiently utilize data in decision making. It’s important for us to holistically improve these abilities among different teams so that their skillsets complement one another and strengthen the overall data value chain. This article will dive into the ‘how’ of it with the primary objectives of:
Enabling you to assess what data skills gaps exist;
Providing insights on what skills are relevant to different roles and responsibilities;
Detailing out best practices in improving abilities of government agencies to use their own data for evidence-based policymaking; and
Recommending resources on building these skills at the individual, organizational and systemic level to sustain capacity building efforts by creating a central repository of resources.
How to get started
Systematically building data capacities (for yourself and in your team) would mean starting at the very beginning, gaining an understanding of the general data landscape, familiarizing yourself with the strategies of your organization and mapping the existing data teams, data flows and guidelines when it comes to your policy challenge. If you haven’t already done this, the map your data ecosystem tool would be a great resource to get a broad understanding of the data landscape related to your policy challenge.
There’s a plethora of trainings available to equip your teams with data skills. However, as a policymaker, identifying who gets trained in data skills within your team (and perhaps organization) requires a thoughtful approach. Start by considering the data needs and gaps related to your policy challenge, then identify which gaps arise from limitations in the skills and capacities of your team members, and finally identify the key roles and the most effective way of building their capacities.
I. Identify gaps in data skills and statistical capacities
As the first and foremost step, building data capacities requires assessing the current skill levels of the people involved in addressing the issue at hand. Building data skills for all people involved may be out of your scope of work, which is why it’s practical to start small, preferably with your immediate team and counterparts. Keep in mind that building capacities is a long and continuous process that requires a culture of evidence-informed policymaking to be fostered across government.
To identify an individual or team level skills gap, we recommend assessing the technical know-how and knowledge of processes and systems as well as the problem-solving and creative thinking abilities needed to systematically use data in policies.
The work know-how and problem-solving abilities needed are rather context dependent. The technical know-how needed may also vary; however, there are certain global best practices, standards and guidelines that can support you in building effective skills in your teams. The data capability framework (DCF), developed by the Government of New Zealand, is one such tool for managing capabilities on the use of data at an individual, team and potentially organizational level. The framework (and other tools like it) is instrumental in:
Identifying current skills gaps and strengths of individuals and teams
Matching training opportunities to capacity building needs
Identifying future skills needs
Developing a talent pool
The framework has 25 capabilities, all relevant to how you would assess the skills in different teams.
Note: Not all capabilities are applicable for all issues. Similarly, not all capabilities would be relevant for all team members. For example, if there are designated personnel on the team for accessing data from different sources, the most relevant skills for those personnel would be to employ data management processes, ask for right data related to the problem and contribute to systematic improvement of data collection processes.
It’s important to identify which capability is suited to each team member and, more importantly, whether there are capabilities that none of the team members have. The framework categorizes each capability into different maturity levels: new, proficient and expert. New indicates having a basic understanding of the subject or process; Proficient indicates having enough experience to work independently and source additional expertise as needed; and Expert indicates the individual can innovate in the subject or process and guide others via mentoring and training.
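As a minimal sketch of this mapping exercise, the Python snippet below records each team member's maturity level per capability and reports which capabilities nobody covers and which are held only below the proficient level. The team member names, the capability labels and the minimum-level threshold are hypothetical placeholders rather than entries from the Data Capability Framework; only the three maturity levels come from the framework as described above.

```python
# Maturity levels as described above, ordered from lowest to highest.
LEVELS = {"new": 1, "proficient": 2, "expert": 3}

# Hypothetical team assessment: member -> {capability: maturity level}.
# Capability names are illustrative placeholders, not the framework's own list.
team = {
    "Member A": {"data collection": "expert", "data visualization": "new"},
    "Member B": {"data collection": "proficient", "data governance": "new"},
    "Member C": {"data visualization": "proficient"},
}

# Hypothetical set of capabilities relevant to the policy challenge.
required = ["data collection", "data visualization", "data governance", "data ethics"]


def best_level(capability, assessment):
    """Return the highest maturity level any member holds, or None if nobody has it."""
    held = [skills[capability] for skills in assessment.values() if capability in skills]
    return max(held, key=LEVELS.get) if held else None


def find_gaps(capabilities, assessment, minimum="proficient"):
    """Split capabilities into those nobody has and those held only below `minimum`."""
    missing, below = [], []
    for capability in capabilities:
        level = best_level(capability, assessment)
        if level is None:
            missing.append(capability)
        elif LEVELS[level] < LEVELS[minimum]:
            below.append(capability)
    return missing, below


missing, below_minimum = find_gaps(required, team)
print("No team member has:", missing)
print("Held only below proficient:", below_minimum)
```

Keeping "nobody has it" separate from "held only at a basic level" is a design choice in this sketch: the first may call for hiring or partnering, while the second may be addressed through training.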
In addition to the Data Capability Framework, there are tools that can help comprehensively assess how you and your team are equipped in terms of data maturity. For example, the Data Maturity Assessment (DMA) for Government was created by the UK Government specifically for use in the public sector. It’s a way to understand and identify strengths and weaknesses in an organization’s data ecosystem. While the framework covers several topics, the below sub-topics focus on assessing data skills and knowledge from a holistic perspective of culture, governance and data know-how at the leadership level. The framework is designed for self-assessment from an organization’s perspective. However, it can be adapted to the level of teams dealing with specific challenges. Overall, the DMA is a very useful tool to assess institutional, organizational and individual skills levels.
II. Build data capacities in your team
Once you’ve identified the skill gaps within the team, it’s time to bridge them, either by upskilling/reskilling existing team members or by expanding the team to ensure that your organization possesses the right expertise for addressing all issues under its mandate.
What data skills are typically needed in teams?
An effective team ideally possesses a combination of technical and leadership skills when it comes to using data for decision making. While sector- or theme-specific knowledge may be essential for certain roles, the below represents a sample of various data-related positions found in government organizations and the skills associated with each position. Note that the specific roles and titles, of course, vary in every country and context.
Types of skills
Collect, clean and analyse data for insights
Use statistical methods and data visualization techniques
Identify patterns, trends and correlations in data sets
Build data visualization tools such as data dashboards
Note: There may be specific types of data analysts for different data (e.g., satellite imagery)
Establish and implement data governance frameworks, policies and procedures
Ensure compliance with data protection regulations, define data quality standards and promote data security practices
Establish ethical standards and best practices for handling sensitive and personal data, including developing guidelines for anonymization, data de-identification, informed consent and data retention policies
Ensure the usage of data for the specific challenge/use case is within constitutional authority
Note: There may be more than one person handling these different functions depending on the issue
Identify data needs and objectives of the policy challenge
Collaborate with relevant stakeholders to ensure that the data collected aligns with policy priorities
Adhere to ethical guidelines and legal requirements in data collection and storage
Note: There’s no specific sequence in which these specialists need to be hired. This is dependent on the existing composition of your team and demands of the challenge you’re facing. Equally important is to know that having topical/sectoral knowledge is not optional. While hiring, consider personnel with combined technical and sectoral expertise or have technical personnel work very closely with the sectoral experts.
The overview above provides the most widely practiced responsibilities under the corresponding roles. However, when hiring or training staff with these skills, it’s essential to differentiate between who requires advanced data skills and who requires basic understanding of data concepts. It’s also important to keep in mind the differential needs within various stakeholders working on the same problem. For example, data analysts working at the federal/national level may have an additional task of data aggregation across various states or counties – as opposed to the state/regions/provinces or local level officials who may be dealing with different challenges around collecting and cleaning data with standard processes and procedures. Such differences in main roles and responsibilities should be reflected in the terms of reference you prepare for hiring or skilling.
Useful tip: Identifying individuals who already possess some level of data skills or related experience might be a good identifier of target trainees. Consider their potential for growth and their interest in developing data skills further. This way, internal talent can be nurtured and upskilled, which can be more cost-effective and beneficial in the long run.
Here is a potentially relevant resource from India where detailed terms of reference are laid out for a data and strategy unit. The terms of reference cover key roles, functions and departments as well as human resource requirements for the entire unit.
Similarly, additional skills around monitoring and evaluating the use of data and developing M&E systems for policies and their impact may be synonymous with data skills for certain teams. Therefore, it’s important to prioritize training decisions around who needs to be skilled, what skills are needed and when the skilling should take place, based on the significance of impact on your policymaking processes. Focus on areas where the improvement of data skills can lead to tangible outcomes, such as improved decision making, enhanced operational efficiency or accurate data analysis.
While you plan to build the data skills of your team members, it’s important that you simultaneously empower everyone with the ability to read, understand, create and communicate data as information. This will enable your team and the organization to adapt to existing and fast-changing digital developments. Put simply, just as “literacy is our ability to read, write and comprehend language, data literacy is our ability to read, write and comprehend data.” (Data to the people, 2018). What does it mean for your team? Having the basics of: 1) Asking the right questions 2) Knowing what information needs to exist 3) Understanding the basics of data (which data, when to collect it, by whom) 4) Knowing the processes of how data is shared, accessed, presented, tracked and monitored. This doesn’t mean that each member of the team does a deep dive into technical and operational aspects of each process, but rather understands the work know-how and knows who and what to refer to when needed.
Checklist of useful basic data skills for governments
Data Fundamentals: The Data Fundamental modules provide a solid overview of the workflow with data, guiding you from what data is and how to make your data tell a story. (Source: School of Data)
Extracting Data: You know the data you need is somewhere out there on the Web, but how do you get it on your computer? This is where you can get some guidance. (Source: School of Data)
Collecting Data: Collecting data in the field for research or humanitarian purposes can be a complicated process. This module is a quick introduction to the right tools and a good process for mobile data collection. (Source: School of Data) For a deeper dive into ensuring efficient data collection and which sources to consider, refer here.
Learning Public Datasets: Struggling with availability or access to certain data? This course shows how to find free, public sources of data on a variety of business, education and health issues, and you can download the data for your own analysis. (Source: LinkedIn Learning)
Analysing Data: Analysing data for public decision making can be complicated to say the least. This module is best suited once you’re familiar with accessing, extracting and collecting data. (Source: Udemy)
Presenting Data: Data presentation can mean data visualization. There are a million ways (and tools) to present data, and the best way is the one that works for your audience. This course will introduce you to some interesting tools and methods, allowing you to tackle data presentation in various ways. (Source: School of Data)
Data for Effective Policymaking (Understanding and Interpreting Data): This module strengthens your ability to use, understand and interpret data. Using the "Numbers for Development" and "Caribbean Data Portal" platforms developed by the IDB, which present data and socioeconomic indicators from the Latin American and Caribbean region, you’ll be able to understand how to improve the decision making process in public management. (Source: edX)
Note: These are only some examples of suggested open source trainings. The software as well as platforms are neither prescriptive nor exhaustive, rather an opportunity to start thinking of the best way to allocate and plan for resources around capacity building.
A mapping exercise undertaken by ITU, UNDP and the UN Office of Secretary General, this database takes stock of ongoing activities in the field of digital capacity development offered by different types of providers at the global and regional level, with a view to creating a searchable self-serve database of training activities. Specifically for data, it provides several recommendations regarding training material in data science, database management, big data, artificial intelligence and digital literacy.
Technical skills are not enough! Focus on leadership and other soft skills to ensure continued upskilling.
While technical data skills are crucial for policymakers to effectively utilize data, simultaneously developing certain soft skills is equally important to maximize the impact of data-driven decision making.
Critical thinking skills to analyse and interpret data objectively, i.e., to identify biases or limitations and critically assess the validity of data-driven insights, are essential. Practice analysing and evaluating information from various sources. Data doesn’t mean quantitative data only – consider data from all types of sources. Look for evidence, logical reasoning and potential alternative viewpoints. Encourage your team members to ask probing questions and challenge assumptions. Similarly, data can provide insights into the root causes of problems and potential solutions.
You should be able to apply problem-solving skills to identify data-driven strategies and interventions that address your issue. Engage in collaborative problem-solving activities with all stakeholders involved. Remember that many actors in the ecosystem – from civil society organizations to local leaders – may be able to provide evidence and grassroots perspectives. Participate in group discussions, workshops or brainstorming sessions where you can work together to tackle your complex problems.
Strong communication is crucial to effectively convey data-driven insights and recommendations. You should be able to distill complex data into clear and concise messages that resonate with different audiences, including the general public. Finding stories in data, developed by ODI, provides a series of lessons on how to find and tell compelling stories with data and present them in a way that has impact.
The Skills Framework by ODI provides an overview of technical skills as well as soft skills that are helpful when working with data.
While the navigator doesn’t recommend any one software over another, some of the most commonly used programmes globally are listed below. Consider them (or programmes with similar functions) in building your data architecture.
Microsoft Excel is a widely adopted spreadsheet software that offers basic data analysis capabilities. It's commonly used for data entry, manipulation and basic statistical analysis.
Tableau is a powerful data visualization tool that allows users to create interactive and visually appealing dashboards and reports. It's widely used in governments to analyse and present data in a user-friendly and intuitive way.
Power BI is a business analytics tool offered by Microsoft. It allows users to connect to various data sources, create interactive dashboards and reports and share insights with others.
R is a programming language and software environment specifically designed for statistical computing and graphics, while Python is a general-purpose programming language that is widely used for data analysis. Both are open-source and widely used in the data science community.
Geographic Information System (GIS) software is used extensively in government for spatial data analysis and mapping. Popular GIS tools include ArcGIS, QGIS and GeoDa, which allow governments to analyse geospatial data, visualize patterns and support decision making in areas such as urban planning, disaster management and environmental assessments.
III. Operationalize data capacity building
To operationalize data capacity building, it’s important to build a strategic vision of the outcomes you aspire to in the short, medium and long term. Need to use data for informing decisions in times of crises? You may need some immediate short-term solutions. However, these short-term solutions need to be complemented with medium- and long-term measures to ensure that you’re better equipped to face such crises in future.
Short-term approaches (0-1 year):
Training workshops: These are focused training workshops to introduce basic data concepts, tools and techniques to government officials aligned with the gaps identified based on skills assessment. These workshops can cover technical topics around data analysis, visualization and cleaning as well as overarching topics of knowledge management, data flows and understanding overall organizational processes of data.
Online learning resources: It might be worth exploring the open sources for training (mentioned in this navigator and otherwise) for your team. Generally, it can be useful to make your team aware of, and give them access to, online courses, tutorials and resources that enable self-paced learning. Platforms like Coursera, edX and DataCamp offer a variety of data-related courses that can be accessed by you (and your team).
Knowledge sharing on global innovations and use cases: Global use cases can inspire government officials by showcasing successful applications of data skills in various domains and sectors. Learning about how data has been leveraged in different parts of the world can generate ideas and spark creativity in finding innovative solutions to local challenges.
Medium-term approaches (1-3 years):
Internal data champions: Identify individuals that can be responsible for driving data initiatives and promoting data literacy and skills development within your team and organization.
Periodic data skills training: Tailor data skills training for your team. Focus on the specific data needs, challenges and applications relevant to your team or organization, ensuring that the training is practical and directly applicable to their work. For example, understanding the fundamentals of epidemiology and biostatistics is important for health policymakers. Similarly, the ability to integrate and harmonize data from diverse educational databases, assessments, surveys and administrative systems is valuable for policymakers in the education sector.
Pilot data projects: Undertake small-scale data projects to apply and reinforce the skills learned. These pilot projects can demonstrate the value of data-driven approaches and build confidence among officials in using data for decision making.
Build complementary skills with other stakeholders: Collaboration and partnership with the private sector can provide valuable opportunities for knowledge exchange as well as leveraging their expertise. For example, you may need to look at satellite imagery or other forms of new data sources for a specific infrastructure-related challenge. Building collaborative approaches with private sector entities collecting such data and leveraging their skills may be more cost effective and efficient in some scenarios. It’s important to think about building these partnerships in a systematic way such that complementary skills from different stakeholders and sectors can be beneficial to all.
Build on local knowledge and capacities: It’s also important to design capacity development initiatives that build on local knowledge: the insights, experiences and context-specific information held by individuals, communities and organizations within a particular geographic area, within local governments and within local grassroots organizations.
Build a network of data practitioners across your organization: Building a network of data practitioners within your organization can foster collaboration and knowledge sharing. Identify personnel with common interest areas around use of data, organize them and facilitate opportunities for deeper discussions and collaboration.
Long-term approaches (3+ years):
Embed data skills in job roles: Incorporate data skills as a core requirement in job descriptions and performance evaluations for relevant positions. This ensures that data skills are considered essential competencies for government officials and helps create a culture of data-driven decision making.
Institutionalize data training programs: Develop comprehensive, ongoing training programs that are integrated into the regular training curriculum of the government organization. This can include in-house training sessions, certification programs and continuous learning opportunities to keep officials updated with evolving data techniques and tools. Refer to this section for further guidance and best practices on institutional mechanisms for enabling team-wide data-driven decision making.
Partner with think tanks and external organizations: For such continuous learning opportunities, these partnerships may also be beneficial as they’re in a better position to keep up with the evolving needs and challenges in the public policy sphere. For example, J-PAL collaborated with the Department of Personnel and Training (DOPT) of the Government of India to create learning opportunities for the entire ecosystem of public officials.
Partner with academia and local universities: Local universities often have specialized research expertise and academic resources that can contribute to data-driven decision making as well as systematic capacity building of local government officials. In addition, partnering with universities can facilitate access to data repositories, survey findings and other relevant datasets that can be leveraged for evidence-based policymaking.
These approaches should be adapted and combined based on the specific needs, resources and priorities of each government organization. It's important to recognize that building data skills is an ongoing journey, and continuous investment and support of resources and political will is necessary to sustain and enhance data capabilities in the long run.
A key focus in building the workforce’s data capacities will be new and fast-changing trends in the space. Big data analytics, artificial intelligence, real-time data and sensor networks are a few trends that are rapidly changing skilling requirements.
Big Data Analytics: Governments are increasingly harnessing big data and advanced analytics techniques to analyse large and complex datasets. This includes leveraging technologies such as machine learning, artificial intelligence, and predictive analytics to gain insights, identify patterns, and make data-driven predictions for various purposes, including policy formulation, resource allocation, and service delivery optimization.
Open Data and Data Transparency: Governments are embracing open data initiatives, making public datasets accessible and freely available to citizens, businesses, researchers, and developers. ‘Open and Smart Government’ is a MOOC by TU Delft that aims to introduce key principles around open government and look at current trends and global developments in this field by comparing the release of data by governments in Europe and the USA. Open data essentials is the Open Data Institute's e-Learning programme developed for the European Commission, covering the essentials of open data, how to plan and measure success and how to implement an open data programme technically.
Real-time Data and Sensor Networks: Governments are increasingly utilizing real-time data streams from sensors, IoT devices, and social media platforms to monitor and respond to dynamic situations. Real-time data enables governments to detect patterns, anticipate trends, and make timely decisions in areas such as public safety, emergency response, and urban management.
Artificial Intelligence: As governments explore the use of artificial intelligence (AI) technologies, there is a growing focus on ethical AI practices and responsible data use. Governments are developing guidelines and frameworks to ensure fairness, transparency, and accountability in AI algorithms and to address potential biases and discriminatory outcomes. This online module developed with the funding of GIZ provides guidance to enhance the capacity of policy makers to respond to the challenges and benefits of AI in strengthening systems of governance and sustainable development, by institutionalising policies that promote use of AI in ways that are inclusive, responsible and sustainable. The AI Readiness Assessment, developed by UNDP, comprises a comprehensive set of tools that allow governments to get an overview of the AI landscape and assess their level of AI readiness across various sectors. The framework is focused on the dual roles of governments as 1) facilitators of technological advancement and 2) users of AI in the public sector.
Sustaining data capacities at a systemic level
While all capacity building initiatives should be designed from a systems perspective, these efforts are often for ad-hoc demands in siloed departments of the organization. This is why a lot of data demands are bridged by third party contractors that often lack the topical knowledge or understanding of different gaps in your system. Therefore, there’s a need for sustained efforts to enhance the capabilities, resources and structures within the entire system.
Continuously enhancing data infrastructure: Invest in the necessary data infrastructure to support data management, analysis and sharing. This may involve setting up data repositories, data integration platforms and data analytics tools and securing data storage systems. In addition, think about building technology roadmaps that can help identify the specific technological needs and requirements for data-related processes. Then assess the current state of technology infrastructure, tools and systems within your government and how they need to be developed. Of course, you need to plan for resources for this!
Allocating budget to systems capacity building with respect to data: You probably already know the sources of funds for the execution of statistical activities, e.g., whether they’re provided by the national government, private actors, etc. While planning annual budgets, it’s important to consider the resources needed to conduct additional data capacity building activities and to revise them every year.
Building robust statistical production processes and code of conduct: Robust statistical production processes help maintain high standards of data quality. These processes involve rigorous methodologies, data collection protocols, data validation and verification procedures and adherence to statistical standards. Refer here for guidelines on data quality assurance. Note: You may not always be producing data, as it often sits with the national statistical offices; however, it’s important to evaluate it based on the validated methodologies and standards. In addition, building processes for data integration and interoperability will enable you to access and combine data from multiple sources to gain a holistic view of the issues at hand. It’s also important to build feedback loops and consistently evaluate the impact of data skills on decision making and how it can be improved.
Institutionalize trainings and tools: Integrating data skills training into onboarding and professional development programs, or having dedicated units within the government responsible for policymaker training, will ensure that you can sustain and upgrade capacity building initiatives.
Monitor and evaluate progress: Continuously monitor and evaluate the progress of data capacity building efforts. Set key performance indicators (KPIs) to measure the impact of data-driven initiatives and regularly assess the effectiveness of skills development programs. Share learnings, limitations and successes across teams and counterparts.
Build interdisciplinary teams: Assemble interdisciplinary teams that include professionals with diverse data skills. By bringing together individuals with expertise in data analysis, statistics, programming and policy domains, you can leverage the collective skills and knowledge to make more informed decisions.
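For the monitoring and evaluation step above, one way to make a KPI concrete is sketched below in Python: the share of assessed staff at proficient or expert level, compared between two assessment rounds. The KPI definition, the round labels and all figures are hypothetical illustrations, not results from any real assessment.

```python
# Hypothetical assessment results: round label -> maturity level per staff member.
# All figures are made up for illustration.
rounds = {
    "baseline": ["new", "new", "proficient", "new", "expert"],
    "follow-up": ["new", "proficient", "proficient", "proficient", "expert"],
}


def share_proficient_or_above(levels):
    """Return the fraction of staff assessed as proficient or expert."""
    if not levels:
        raise ValueError("no assessments recorded")
    qualified = sum(1 for level in levels if level in ("proficient", "expert"))
    return qualified / len(levels)


baseline = share_proficient_or_above(rounds["baseline"])
follow_up = share_proficient_or_above(rounds["follow-up"])
change = follow_up - baseline

print(f"Baseline: {baseline:.0%}, follow-up: {follow_up:.0%}, change: {change:+.0%}")
```

A single indicator like this only tracks skill levels; whether better skills lead to better decisions would still need to be assessed separately.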
Where do we go from here?
Building data skills is an ongoing process. Embrace the learning journey, seek continuous improvement and adapt to new technologies and methodologies as they emerge in the field of data.
Stay informed about emerging trends and technologies: Stay updated on the latest trends, technologies and best practices related to data-driven decision making. Following reputable sources, attending conferences and workshops and engaging with experts in the field will come in handy.
Finally, the key to building sustained data capacities is promoting a data-driven culture within your organization. This involves encouraging the use of data in decision making, advocating for the adoption of data-driven practices and setting an example by incorporating data analysis into your own work.