Neighborhoods of Mexico City and some of their characteristics

Note

In this file, I import colonias_imc2020.shp and merge some redundant neighborhoods together. I save the result to output/colonias.gpkg, the file I always use as a reference for neighborhoods.

1 Neighborhoods of Mexico City

Mexico City is organised into 16 districts, or alcaldías (equivalent to municipios outside of CDMX). Each district is in turn divided into neighborhoods, or colonias. There are about 2000 colonias in Mexico City, although the exact number varies by source.

The shapefile for neighborhoods used in this analysis comes from the National Population Council (CONAPO), which provides the boundaries of the colonias in Mexico City. There is no single official delimitation of the colonias, so this is what I work with.

Since the call data is referenced by neighborhood name, I need this “neighborhood framework” to locate the calls. It is not ideal, which is why I used another level for the crime data, namely AGEBs. AGEBs are divisions of urban territory, developed by INEGI (the national statistics institute), and are used for the census. They are more homogeneous in terms of population and area, and they are the reference level for all the neighborhood characteristics I gather in the next script.

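# Packages used throughout this script
library(sf)         # read_sf(), st_transform(), st_union(), write_sf()
library(dplyr)      # data manipulation and the %>% pipe
library(ggplot2)    # plotting
library(ggspatial)  # annotation_scale()
library(here)       # project-relative paths

conapo <-
  read_sf(here("..",
               "raw_data",
               "1_colonias",
               "IMUC_20",
               "imc2020_shp",
               "colonias_imc2020.shp"),
          quiet = TRUE) %>%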
  rename_with(tolower) %>%
  filter(nom_ent == "Ciudad de México") %>% 
  select(-id_col) %>% 
  st_transform(4326) %>% 
  st_make_valid()

conapo %>% 
  ggplot() +
  geom_sf(aes(fill = nom_mun), alpha = .4) +
  # label the legend title
  scale_fill_hue(name = "Alcaldías") +
  annotation_scale(location = "br", style = "ticks") +
  labs(title = "Colonias and alcaldías in Mexico City") +
  theme_minimal()

Select relevant attributes:

conapo <- conapo %>% 
  select(objectid, mun, nom_mun, colonia, pobtot)

2 Spatially merge some neighborhoods

Some colonias share a name and correspond to the same colonia: the data holds two entries for what is really a single colonia. I want to merge them.

conapo <- conapo %>% 
  arrange(nom_mun, colonia, objectid) %>%
  group_by(nom_mun, colonia) %>% 
  mutate(to_merge = n() > 1,
         col_id = cur_group_id(),
         .before = everything()) %>% 
  select(-objectid)
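To make the flagging step concrete, here is a minimal sketch on a toy table. The rows, the colonia names, and the object names `toy` and `flagged` are made up for illustration and are not taken from the CONAPO data. It shows how `n() > 1` marks every row of a duplicated (nom_mun, colonia) group and how `cur_group_id()` gives all rows of a group the same identifier.

```r
library(dplyr)

# Hypothetical toy data: the first two rows share the same alcaldía and colonia name
toy <- tibble(
  nom_mun = c("Azcapotzalco", "Azcapotzalco", "Azcapotzalco", "Tlalpan"),
  colonia = c("Centro", "Centro", "Norte", "Centro"),
  pobtot  = c(100, 50, 300, 80)
)

flagged <- toy %>%
  group_by(nom_mun, colonia) %>%
  mutate(to_merge = n() > 1,             # TRUE when the group has several rows
         col_id   = cur_group_id()) %>%  # one id per (nom_mun, colonia) pair
  ungroup()

flagged
```

Because the grouping uses both the alcaldía and the colonia name, a name that appears in two different alcaldías is not flagged.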

Most often, the polygon of one entry is contained within the other; sometimes the two are next to each other. Here is an example for one of the 16 districts, Azcapotzalco.

conapo %>% 
  filter(nom_mun == "Azcapotzalco") %>% 
  ggplot() +
  geom_sf(aes(fill = to_merge)) +
  scale_fill_brewer(name = "To be merged",
                    palette = 5,
                    direction = 1) +
  annotation_scale(location = "br", style = "ticks")

I merge the polygons of colonias that share a name within the same alcaldía, whether one is nested in the other or they sit next to each other. To do so, I spatially aggregate the polygons and sum their populations:

merged <- 
  conapo %>% 
  filter(to_merge) %>% 
  group_by(col_id, mun, nom_mun, colonia, to_merge) %>% 
  summarise(geometry = st_union(geometry),
            pobtot = sum(pobtot))

# Remove the polygons that were merged from the original dataset, and add the merged polygons.
colonias <- 
  conapo %>% 
  filter(!to_merge) %>% 
  bind_rows(merged) %>% 
  arrange(col_id) %>% 
  rename(merged = to_merge)
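As an illustration of the spatial aggregation, the sketch below dissolves two adjacent squares that carry the same name. Everything in it is hypothetical (the helper `sq()`, the name "A", the population values, and the absence of a coordinate reference system); it only mirrors the `group_by()` plus `summarise(st_union())` pattern used above.

```r
library(sf)
library(dplyr)

# Hypothetical helper: a unit square whose lower-left corner is at (x0, 0)
sq <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1),
                        c(x0, 1), c(x0, 0))))
}

# Two toy polygons sharing an edge and a name
toy <- st_sf(
  colonia  = c("A", "A"),
  pobtot   = c(100, 50),
  geometry = st_sfc(sq(0), sq(1))
)

toy_merged <- toy %>%
  group_by(colonia) %>%
  summarise(geometry = st_union(geometry),  # dissolve into a single polygon
            pobtot   = sum(pobtot))         # add up the populations

toy_merged
```

Note that the populations are summed, so this pattern assumes the duplicated entries count distinct residents.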

# Drop intermediate objects that are no longer needed
rm(merged, conapo)

3 Save the neighborhoods file

colonias %>% 
  write_sf(here("..",
                "output",
                "colonias.gpkg"))

4 Population in the neighborhoods

After the spatial merge, 1948 colonias remain; the merge ended up affecting only a few observations.

The problem with the neighborhoods is that they vary widely in size, both in area and in population. In that respect, AGEBs are preferable.

colonias %>% 
  ggplot() +
  geom_sf(aes(fill = pobtot)) +
  scale_fill_distiller(name = "population",
                    palette = 1,
                    direction = 1) +
  annotation_scale(location = "br", style = "ticks") +
  labs(title = "Population by neighborhood: original delimitations in CONAPO") +
  theme_minimal()

Plot the distribution of population per neighborhood.

colonias %>% 
  ggplot(aes(x = pobtot)) +
  # after_stat() replaces the deprecated ..count.. notation
  geom_density(aes(y = after_stat(count)),
               alpha = .2, fill = "red") +
  labs(title = "Density of neighborhood population")

Or zooming in:

colonias %>% 
  #mutate(pobtot = if_else(pobtot > 20000, 20000, pobtot)) %>% 
  ggplot(aes(pobtot)) +
  # share of neighborhoods per bin; after_stat() replaces the deprecated ..count..
  geom_histogram(aes(y = after_stat(count / sum(count))), binwidth = 100) +
  scale_y_continuous(labels = scales::percent_format()) +
  labs(title = "Zooming on the lower end",
       x = "Population (bins of width 100)",
       y = "Percentage of neighborhoods") +
  coord_cartesian(xlim = c(-10, 10000))

Another map, capping pobtot at 20000 to reveal more of the heterogeneity among smaller neighborhoods:

colonias %>% 
  mutate(pobtot_maxxed = case_when(pobtot > 20000 ~ 20000,
                                   TRUE ~ pobtot)) %>%
  ggplot() +
  geom_sf(aes(fill = pobtot_maxxed), color = gray(.7), alpha = .9) +
  annotation_scale(location = "br", style = "ticks") +
  scale_fill_distiller(type = "seq", palette = 3, direction = 1,
                       breaks = c(200, 10000, 20000),
                       labels = c("0", "10000", "+20000")) +
  theme(legend.position = "bottom",
        legend.title = element_blank()) +
  labs(title = "Population per neighborhood",
       subtitle = "The color scale is capped at 20000")
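The capping applied for this map is a one-sided clamp: any value above the threshold is replaced by the threshold itself. As a minimal sketch, assuming made-up population values, the same result as the `case_when()` call can be obtained with base R's `pmin()`:

```r
# Hypothetical population values, for illustration only
pobtot <- c(150, 8000, 20000, 45000)

# Cap at 20000: values above the threshold are replaced by the threshold
pobtot_capped <- pmin(pobtot, 20000)

pobtot_capped
```

Capping this way keeps a few very large colonias from stretching the color scale, at the cost of hiding differences among the neighborhoods above 20000.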