Heterogeneity: poverty

Note

I run the heterogeneity on poverty. I have a categorical variable thats not great, but I also have a continuous variable that is the share of people living in poverty.

From the continuous, I bin population into ntiles and run similar regressions to the ones on income.

library(tidyverse)
library(here) # path control
library(fixest) # estimations with fixed effects
library(knitr)
library(sf)
library(tictoc)
tic("Total render time")

source("fun_dynamic_bins.R")
source("fixest_etable_options.R")

mem.maxVSize(vsize = 40000)
[1] 40000
data <-
  read_rds(here("..", "output", "data_reports.rds")) %>% 
  filter(covid_phase == "Pre-pandemic")

data <- data %>% 
  mutate(prec_quintile = ntile(prec, 5),
         rh_quintile = ntile(rh, 5),
         wsp_quintile = ntile(wsp, 5))

ageb_attributes <- 
  readRDS(here("..", "output", "ageb_attributes.rds"))

poligonos <-  
  read_sf(here("..",
                "output",
                "ageb_poligonos.geojson"),
          quiet = TRUE)

Compute ntiles for poverty and detail the data

poverty_nbi is defined as the share of “Población pobre por Necesidades Básicas Insatisfechas (NBI)” where poverty is defined considering both income and social deprivation… I also have a more raw variable for poverty, that is category of share of ppl living in poverty, named pobreza. I discuss both these variables and their source on this page (AGEBs in Mexico City).

I can relate the continuous and categorical poverty variables to log(income).

ageb_attributes %>% 
  # scatter poverty_nbi vs income_std
  ggplot(aes(x = poverty_nbi, y = income_log, color = pobreza)) +
  geom_point() +
  scale_color_brewer(palette = "RdYlBu", direction = -1, na.value = "grey90") +
  theme_minimal()

And some statistics by category of poverty.

ageb_attributes %>% 
  group_by(pobreza) %>%
  summarise(
    mean(poverty_nbi, na.rm = T),
    mean(income, na.rm = T),
    pobtot = sum(pobtot)
    ) %>% 
  mutate(share_pob = 100*pobtot/sum(pobtot)) %>% 
  drop_na() %>% 
  kable() 
pobreza mean(poverty_nbi, na.rm = T) mean(income, na.rm = T) pobtot share_pob
[ 0, 18] 0.4426136 60331.36 3462481 37.8888429
(18, 34] 0.7129382 38325.60 2806078 30.7060309
(34, 50] 0.8333837 31669.46 2156477 23.5976510
(50, 70] 0.9004305 26422.92 626550 6.8561400
(70, 100] 0.9670625 22977.49 32053 0.3507459

Compute some ntiles of poverty_nbi, we weight by AGEB’s population.

ageb_attributes <- ageb_attributes %>% 
  arrange(poverty_nbi) %>% 
  mutate(
    # Population-Weighted Deciles
    poverty_nbi_decile = ifelse(
      is.na(income) | is.na(pobtot), NA, 
      as_factor(cut(
        cumsum(pobtot[!is.na(poverty_nbi)]) / sum(pobtot[!is.na(poverty_nbi)], na.rm = TRUE), 
        breaks = seq(0, 1, by = 0.1), 
        labels = 1:10, 
        include.lowest = TRUE
      ))
    )) %>% 
  mutate(
    poverty_nbi_quintile = case_when(
      poverty_nbi_decile %in% c("1", "2") ~ "1",
      poverty_nbi_decile %in% c("3", "4") ~ "2",
      poverty_nbi_decile %in% c("5", "6") ~ "3",
      poverty_nbi_decile %in% c("7", "8") ~ "4",
      poverty_nbi_decile %in% c("9", "10") ~ "5",
      TRUE ~ NA),
    poverty_nbi_median = case_when(
      poverty_nbi_decile %in% c("1", "2", "3", "4", "5") ~ "Below Median",
      poverty_nbi_decile %in% c("6", "7", "8", "9", "10") ~ "Above Median",
      TRUE ~ NA)
  ) %>% 
  mutate(
    poverty_nbi_decile = as_factor(poverty_nbi_decile),
    poverty_nbi_quintile = as_factor(poverty_nbi_quintile),
    poverty_nbi_median = as_factor(poverty_nbi_median)
  )

I plot those new poverty quintiles. Now, the number of AGEB in each is much more balanced (altough not equal since we bin by population). We see that it does not correspond to the quintiles of income (would be horizontal cuts on the plot).

ageb_attributes %>% 
  # scatter poverty_nbi vs income_std
  ggplot(aes(x = poverty_nbi, y = income_log, color = poverty_nbi_quintile)) +
  geom_point() +
  scale_color_brewer(palette = "RdYlBu", direction = -1, na.value = "grey90") +
  theme_minimal() 

I join these attributes to the main panel

data <- 
  left_join(data, ageb_attributes)

Baseline regressions for reference

First the baseline regressions for reference:

baseline_linear <- data %>% 
  fepois(reports_dv ~ tmean |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)

baseline_bins <- data %>% 
  mutate(bins_tmean_one = create_dynamic_bins(tmean, width = 1, 0.01)) %>% 
  fepois(reports_dv ~ i(bins_tmean_one, ref = "[16") |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)

Regs with pobreza (categorical only)

Either temp enters linearly:

reg_pobreza_linear <- data %>% 
  fepois(reports_dv ~ i(pobreza, tmean) |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)

reg_pobreza_linear %>% 
  etable() %>% 
  kable()
.
Dependent Var.: Reports for DV
Temperature x pobreza = [0,18] 0.0200*** (0.0039)
Temperature x pobreza = (18,34] 0.0334*** (0.0042)
Temperature x pobreza = (34,50] 0.0283*** (0.0043)
Temperature x pobreza = (50,70] 0.0329*** (0.0070)
Temperature x pobreza = (70,100] 0.0317 (0.0281)
Fixed-Effects: ——————
Prec, hum, wsp quintiles Yes
Neighborhood Yes
Year-Month Yes
Day of Week Yes
Day of Year Yes
________________________________ __________________
S.E.: Clustered by: Neighborhood
Observations 3,543,502

Or temp enters semi parametrically:

reg_pobreza_bins <- data %>% 
  mutate(bins_tmean_one = create_dynamic_bins(tmean, width = 1, 0.01)) %>% 
  fepois(reports_dv ~ i(pobreza, bins_tmean_one, ref2 = "[16") |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)

I plot these results.

Error: object 'coefs_quintile' not found

Regs with poverty_nbi (continuous)

This variable is continuous, so I can use it as a continuous variable or bin it.

Gradient-style heterogeneity

Because we can, but not interpretable as such.

reg_poverty_cont_linear <- data %>% 
  fepois(reports_dv ~ tmean * poverty_nbi |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)

etable(baseline_linear, reg_poverty_cont_linear) %>% 
  kable()
baseline_linear reg_poverty_cont_li..
Dependent Var.: Reports for DV Reports for DV
Temperature 0.0274*** (0.0033) 0.0077 (0.0063)
poverty_nbi -1.3969 (41,209.5177)
Temperature x poverty_nbi 0.0288*** (0.0077)
Fixed-Effects: —————— ———————
Prec, hum, wsp quintiles Yes Yes
Neighborhood Yes Yes
Year-Month Yes Yes
Day of Week Yes Yes
Day of Year Yes Yes
_________________________ __________________ _____________________
S.E.: Clustered by: Neighborhood by: Neighborhood
Observations 3,624,826 3,576,634

N-tiles regressions

With the n-tiles I defined based on the continuous variable poverty_nbi.

Below and above median

First the simple linear version:

reg_above_below_linear <- data %>%
  fepois(reports_dv ~ i(poverty_nbi_median, tmean) |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)

etable(baseline_linear, reg_above_below_linear) %>% 
  kable()
baseline_linear reg_above_below_..
Dependent Var.: Reports for DV Reports for DV
Temperature 0.0274*** (0.0033)
Temperature x poverty_nbi_median = BelowMedian 0.0232*** (0.0038)
Temperature x poverty_nbi_median = AboveMedian 0.0311*** (0.0037)
Fixed-Effects: —————— ——————
Prec, hum, wsp quintiles Yes Yes
Neighborhood Yes Yes
Year-Month Yes Yes
Day of Week Yes Yes
Day of Year Yes Yes
________________________________________ __________________ __________________
S.E.: Clustered by: Neighborhood by: Neighborhood
Observations 3,624,826 3,578,253

Or semi parametric with bins:

reg_above_below_bins <- data %>%
  mutate(bins_tmean_one = create_dynamic_bins(tmean, width = 1, 0.01)) %>% 
  fepois(reports_dv ~ i(poverty_nbi_median, bins_tmean_one, ref2 = "[16") |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)

I can plot these results and save as reg_poverty_median.png

Quintile regressions

I first run the model imposing linearity on tmean and interacting it with poverty quintiles.

reg_quintile_linear <- data %>%
  fepois(reports_dv ~ i(poverty_nbi_quintile, tmean) |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)

etable(reg_quintile_linear) %>% 
  kable()
reg_quintile_lin..
Dependent Var.: Reports for DV
Temperature x poverty_nbi_quintile = 1 0.0149** (0.0048)
Temperature x poverty_nbi_quintile = 2 0.0274*** (0.0044)
Temperature x poverty_nbi_quintile = 3 0.0279*** (0.0046)
Temperature x poverty_nbi_quintile = 4 0.0303*** (0.0044)
Temperature x poverty_nbi_quintile = 5 0.0339*** (0.0047)
Fixed-Effects: ——————
Prec, hum, wsp quintiles Yes
Neighborhood Yes
Year-Month Yes
Day of Week Yes
Day of Year Yes
______________________________________ __________________
S.E.: Clustered by: Neighborhood
Observations 3,578,253

Export the table as reg_poverty_quintile.tex

I then allow for possible non-linearities in the relationship between temperature and DV. With the quintiles, its a bit of a crazy estimation…

tic("quintile")
reg_quintile_bins <- data %>%
  mutate(bins_tmean_one = create_dynamic_bins(tmean, width = 1, 0.01)) %>% 
  fepois(reports_dv ~ i(poverty_nbi_quintile, bins_tmean_one, ref2 = "[16") |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)
toc()
quintile: 953.265 sec elapsed

I plot these results.

I save the full results as reg_poverty_quintiles_bins_full.png.

And only the two extreme quintiles as reg_income_quintiles_bins_full.png

Decile

tic("decile")
reg_decile_linear <- data %>%
  fepois(reports_dv ~ i(poverty_nbi_decile, tmean) |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)
toc()
decile: 231.703 sec elapsed
etable(reg_decile_linear) %>% 
  kable()
reg_decile_linear
Dependent Var.: Reports for DV
Temperature x poverty_nbi_decile = 1 0.0106. (0.0060)
Temperature x poverty_nbi_decile = 2 0.0188** (0.0061)
Temperature x poverty_nbi_decile = 3 0.0291*** (0.0053)
Temperature x poverty_nbi_decile = 4 0.0260*** (0.0055)
Temperature x poverty_nbi_decile = 5 0.0278*** (0.0058)
Temperature x poverty_nbi_decile = 6 0.0281*** (0.0058)
Temperature x poverty_nbi_decile = 7 0.0312*** (0.0056)
Temperature x poverty_nbi_decile = 8 0.0295*** (0.0055)
Temperature x poverty_nbi_decile = 9 0.0374*** (0.0060)
Temperature x poverty_nbi_decile = 10 0.0298*** (0.0060)
Fixed-Effects: ——————
Prec, hum, wsp quintiles Yes
Neighborhood Yes
Year-Month Yes
Day of Week Yes
Day of Year Yes
_____________________________________ __________________
S.E.: Clustered by: Neighborhood
Observations 3,578,253
Rendered on: 2025-04-04 18:56:19
Total render time: 3641.51 sec elapsed