Write an action plan

Having identified your data gaps and needs, are you now looking for data on a specific topic? Are you unclear about where these data lie? Are these data collected by several agencies within and outside of your government? Mapping your current data ecosystem can help identify data sources, data stewards, data users, their roles and how they interact with each other.


In Mexico, an average of 11 women are murdered every day. Between 2000 and 2019, most of the victims were women between the ages of 20 and 24. In 2019, more than 50% of female homicides occurred in public spaces.

Violence against women at the community level, which is perpetrated by an individual or a collective unknown to the victim, occurs in streets, parks, and, to a lesser extent, on buses, minibuses, or subways. The attacks that occur on the street are mainly sexual (66%) and include catcalling, bullying, stalking, sexual abuse, rape, and attempted rape. However, 78% of women and girls over 15 years old do not report these incidents. ​

This problem is relevant because it limits women’s freedom of movement and restricts their right to the city, which is stipulated in Mexico City’s Constitution. Additionally, it limits women’s access to work and education opportunities, access to essential services, their participation in cultural and leisure activities, and their full participation in public life.​

It is in this context that the Women’s Secretariat of the Mexico City government, GIZ’s Data Lab, and the UNDP Accelerator Labs began a project to identify areas in Mexico City where women are safer — with a particular focus on public spaces. The results were presented at a session chaired by Mexico City’s Women’s Secretariat, which was attended by representatives of 16 government agencies of Mexico City. ​

Results presented at the First Ordinary Session of the Cabinet Council for the Monitoring of Public Policies of Substantive Equality of the Government of Mexico City, May 2022
“The recommendations [from this research] serve to strengthen the work and strategies that government agencies already have underway to improve the security of public space for women”

— Ingrid Gómez Saracíbar, Head of the Women's Secretariat

Use case context​

In Mexico, two-thirds of all girls and women above the age of 15 have reported experiencing at least one incident of violence in their lifetime.

The Data Powered Positive Deviance (DPPD) method focuses on outliers, or positive deviants, and seeks to discover why some data points perform better than others. In this case, the question is why some public spaces are safer for women than others.

First, the project team developed a step-by-step process for discovering public spaces that perform better in terms of security for women. The process starts with mapping the relevant data sources, then groups comparable areas into homogeneous groups and defines performance measures to identify positive deviants. Fieldwork, both quantitative and qualitative, follows to collect and analyse the factors underlying positive deviance.

Step-by-step process of the Data Powered Positive Deviance method. We are currently working on identifying the positive deviants.
 Alejandra Cevantes

First, the project team conducted a series of interviews with experts, including academics, activists, urban planners, and public officials with relevant experience on violence against women. This exercise provided initial guidance on the factors that make public spaces safer for women: urban infrastructure, security infrastructure, people, usage of space, and mobility. Furthermore, initial datasets for the analysis were identified.​

Then, they extensively mapped public and non-public datasets. The initial mapping was a wish list of 67 datasets that included urban infrastructure, population, commuting patterns, socioeconomic index, security, and justice. ​

These datasets came primarily from two sources: the Open Data Portal of Mexico City and the National Institute of Statistics and Geography. Non-public datasets owned by public and private entities, such as mobile data, 911 reports, and usage of panic buttons, were also considered.

The team then identified the most important data across these datasets based on relevance, level of aggregation, and date of last update. It was important to get information that was as granular as possible, because the geographic units of analysis need to be small and precise enough to uncover the underlying factors behind positively deviant public spaces. They selected open data related to urban infrastructure (e.g., subway stations, bus stations), land usage, security infrastructure (e.g., location of panic buttons, cameras), census data, and marginalisation indexes for analysis. They also used the Attorney General’s Office dataset, which contains updated information on crime victims in investigation files in Mexico City.

How to find the areas of the city that performed better​

First, the project team defined the unit for their analysis: AGEBs, the basic geostatistical areas used in Mexico.

Next, they divided the AGEBs into homogeneous groups based on population density and incoming daily commuter trips (variables related to the presence of crime) and on the Marginalization Index of Mexico’s National Population Council (which summarises variables related to socioeconomic characteristics). The resulting clusters of AGEBs are shown below.

Overview of the 5-stage model
 Alejandra Cevantes
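The homogeneous grouping described above can be illustrated with a minimal Python sketch. The source does not say which clustering technique the project team used, so this example swaps in a simple tercile binning: each AGEB is labelled low, mid or high on every variable, and AGEBs with the same combination of labels form one group. The AGEB identifiers, field names and values are invented for illustration.

```python
from collections import defaultdict

# Hypothetical AGEB records; identifiers, field names and values are invented.
agebs = [
    {"id": "A1", "density": 100.0, "trips": 500, "marginalization": 0.10},
    {"id": "A2", "density": 110.0, "trips": 520, "marginalization": 0.15},
    {"id": "A3", "density": 300.0, "trips": 2000, "marginalization": 0.50},
    {"id": "A4", "density": 320.0, "trips": 2100, "marginalization": 0.55},
    {"id": "A5", "density": 900.0, "trips": 8000, "marginalization": 0.90},
    {"id": "A6", "density": 950.0, "trips": 8200, "marginalization": 0.95},
]

def tercile_labels(values):
    """Label each value 'low', 'mid' or 'high' according to its rank-based tercile."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [None] * len(values)
    for rank, i in enumerate(order):
        labels[i] = ("low", "mid", "high")[min(rank * 3 // len(values), 2)]
    return labels

def group_agebs(records, keys=("density", "trips", "marginalization")):
    """Group AGEBs that fall into the same tercile on every key (an assumed method)."""
    per_key = [tercile_labels([r[k] for r in records]) for k in keys]
    groups = defaultdict(list)
    for idx, record in enumerate(records):
        signature = tuple(labels[idx] for labels in per_key)
        groups[signature].append(record["id"])
    return dict(groups)

groups = group_agebs(agebs)
for signature, members in groups.items():
    print(signature, members)
```

Comparing AGEBs only within such a group keeps densely populated commuter hubs from being measured against quiet residential areas.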

Developing a useful dataset​

To identify outperforming or positively deviant AGEBs, the project team needed to find a reliable measure of performance. In other words: data that helps them identify which AGEBs record a lower number of crimes against women than expected. To this end, they used the dataset of victims in investigation files of the Attorney General’s Office, which covers 2019 and 2020.

This dataset included: 1) type of crime, 2) day of the week the crime occurred, 3) time of day it occurred, 4) age of the victim, 5) gender of the victim, and 6) geolocation of the occurrence. This last aspect is particularly important, as it makes it possible to associate crime numbers to their respective AGEBs.​

This dataset was then adjusted to only include crimes in public spaces and gender-based violence crimes that could occur in public spaces (e.g., sexual assault, feminicide). Then the analysts categorized the crimes by severity and impact on women. ​
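As a rough illustration of this preparation step, the Python sketch below filters a handful of invented records down to female victims in public spaces and summarises them per AGEB. It assumes each record already carries an AGEB code (in practice this would come from matching the geolocation to AGEB boundaries) and a public-space flag. The field names and severity weights are hypothetical, since the source does not publish the categories the analysts used.

```python
from collections import Counter

# Hypothetical investigation-file rows. The field names, the public_space flag
# and the pre-assigned AGEB codes are assumptions made for this example.
rows = [
    {"crime": "sexual assault", "gender": "F", "public_space": True, "ageb": "A1"},
    {"crime": "robbery", "gender": "M", "public_space": True, "ageb": "A1"},
    {"crime": "feminicide", "gender": "F", "public_space": True, "ageb": "A2"},
    {"crime": "sexual assault", "gender": "F", "public_space": False, "ageb": "A2"},
    {"crime": "robbery", "gender": "F", "public_space": True, "ageb": "A1"},
]

# Hypothetical severity weights; the real categorisation is not given in the source.
SEVERITY = {"robbery": 1, "sexual assault": 2, "feminicide": 3}

def summarise_by_ageb(records):
    """Count female victims in public spaces per AGEB and sum their severity weights."""
    victims = Counter()
    severity = Counter()
    for record in records:
        if record["gender"] != "F" or not record["public_space"]:
            continue  # keep only female victims of crimes in public spaces
        victims[record["ageb"]] += 1
        severity[record["ageb"]] += SEVERITY.get(record["crime"], 1)
    return victims, severity

victims, severity = summarise_by_ageb(rows)
print(dict(victims))   # victim counts per AGEB
print(dict(severity))  # severity-weighted totals per AGEB
```

The per-AGEB victim count is the kind of performance measure that the statistical modelling then tries to predict.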

Cluster analysis shows the diversity of Mexico City.​
 Alejandra Cevantes

Statistical methods used​

The next step of the DPPD method is to identify the positive deviants through quantitative analysis. In this pilot, that meant identifying AGEBs that, after controlling for the relevant characteristics of these spaces, present a lower number of female victims than predicted.

Statistical modelling is necessary for finding positive deviants because the project team needed a model to predict the performance measure, the number of female victims in investigation files. The focus then turns to the residuals, the differences between observed and predicted values. AGEBs whose residuals are outliers in the favourable direction are positive deviants. That is, an AGEB is a positive deviant when the number of victims observed is much lower than the number of victims predicted by the model.

For statistical modelling, they defined the independent variables, those that are going to predict the number of female victims. After defining the independent variables, three types of regression analysis were performed (multiple linear regression, LASSO linear regression, and negative binomial regression). ​
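The residual logic can be shown with a small Python sketch. It is not the project's model: instead of the multiple, LASSO and negative binomial regressions named above, it fits an ordinary least-squares line with a single predictor (population) and flags AGEBs whose residual lies more than k standard deviations below zero. The data, the choice of predictor and the cutoff k = 1.5 are all assumptions made for illustration.

```python
from statistics import mean, pstdev

# Hypothetical data: population (in thousands) and observed female victims per AGEB.
ids = ["A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"]
population = [1, 2, 3, 4, 5, 6, 7, 8]
victims = [2, 4, 6, 8, 2, 12, 14, 16]

def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def positive_deviants(labels, xs, ys, k=1.5):
    """Return labels whose residual (observed - predicted) is below -k standard deviations."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    cutoff = -k * pstdev(residuals)  # k is an assumed threshold, not from the source
    return [label for label, r in zip(labels, residuals) if r < cutoff]

print(positive_deviants(ids, population, victims))  # ['A5']
```

Here A5 records far fewer victims than its population would predict, so it is the one flagged for follow-up fieldwork. With the residual defined as observed minus predicted, a positive deviant has a strongly negative residual.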

Qualitative interviews with women from Mexico City supplemented the statistical modelling and gave a better picture of their experiences, needs and desires.

How can better data contribute to better policy? ​

Among the key insights is that the most relevant variables identified in the three regressions were population size, AGEB area, financial services, restaurants, and bars per AGEB, as well as the distance to the closest Metrobús and metro station. This last variable has a negative relationship with the number of victims: the shorter the distance between the center of the AGEB and the metro station, the larger the expected number of victims. On average, the three models explain 40% of the variance of the outcome variable (the performance measure) across the different clusters, which provides a good starting point for defining positive deviants.

One finding that stands out is the importance of using specific characteristics of green areas in city planning to increase the perception of security or to attract more women, children, families and older adults. Their presence promotes the natural surveillance of public spaces.

Where do we go from here?​

This analysis and the recommendations strengthen the work and strategies that government agencies in Mexico already have underway to improve the security of women in public spaces. ​

The next step is to work with the government's technical teams and transfer these lessons to them, so that they can incorporate the lessons into their work and thus build safer public spaces for women who live, transit, or work in different areas of Mexico City.

