Heterogeneity: poverty

Note

I run the heterogeneity on poverty. I have a categorical variable thats not great, but I also have a continuous variable that is the share of people living in poverty.

From the continuous, I bin population into ntiles and run similar regressions to the ones on income.

library(tidyverse)
library(here) # path control
library(fixest) # estimations with fixed effects
library(knitr)
library(sf)
library(tictoc)
tic("Total render time")

source("fun_dynamic_bins.R")
source("fixest_etable_options.R")

mem.maxVSize(vsize = 40000)

[1] 40000

data <-
  read_rds(here("..", "output", "data_reports.rds")) %>% 
  filter(covid_phase == "Pre-pandemic")

data <- data %>% 
  mutate(prec_quintile = ntile(prec, 5),
         rh_quintile = ntile(rh, 5),
         wsp_quintile = ntile(wsp, 5))

ageb_attributes <- 
  readRDS(here("..", "output", "ageb_attributes.rds"))

poligonos <-  
  read_sf(here("..",
                "output",
                "ageb_poligonos.geojson"),
          quiet = TRUE)

Compute ntiles for poverty and detail the data

poverty_nbi is defined as the share of “Población pobre por Necesidades Básicas Insatisfechas (NBI)” where poverty is defined considering both income and social deprivation… I also have a more raw variable for poverty, that is category of share of ppl living in poverty, named pobreza. I discuss both these variables and their source on this page (AGEBs in Mexico City).

I can relate the continuous and categorical poverty variables to log(income).

ageb_attributes %>% 
  # scatter poverty_nbi vs income_std
  ggplot(aes(x = poverty_nbi, y = income_log, color = pobreza)) +
  geom_point() +
  scale_color_brewer(palette = "RdYlBu", direction = -1, na.value = "grey90") +
  theme_minimal()

And some statistics by category of poverty.

ageb_attributes %>% 
  group_by(pobreza) %>%
  summarise(
    mean(poverty_nbi, na.rm = T),
    mean(income, na.rm = T),
    pobtot = sum(pobtot)
    ) %>% 
  mutate(share_pob = 100*pobtot/sum(pobtot)) %>% 
  drop_na() %>% 
  kable()

pobreza	mean(poverty_nbi, na.rm = T)	mean(income, na.rm = T)	pobtot	share_pob
[ 0, 18]	0.4426136	60331.36	3462481	37.8888429
(18, 34]	0.7129382	38325.60	2806078	30.7060309
(34, 50]	0.8333837	31669.46	2156477	23.5976510
(50, 70]	0.9004305	26422.92	626550	6.8561400
(70, 100]	0.9670625	22977.49	32053	0.3507459

Compute some ntiles of poverty_nbi, we weight by AGEB’s population.

ageb_attributes <- ageb_attributes %>% 
  arrange(poverty_nbi) %>% 
  mutate(
    # Population-Weighted Deciles
    poverty_nbi_decile = ifelse(
      is.na(income) | is.na(pobtot), NA, 
      as_factor(cut(
        cumsum(pobtot[!is.na(poverty_nbi)]) / sum(pobtot[!is.na(poverty_nbi)], na.rm = TRUE), 
        breaks = seq(0, 1, by = 0.1), 
        labels = 1:10, 
        include.lowest = TRUE
      ))
    )) %>% 
  mutate(
    poverty_nbi_quintile = case_when(
      poverty_nbi_decile %in% c("1", "2") ~ "1",
      poverty_nbi_decile %in% c("3", "4") ~ "2",
      poverty_nbi_decile %in% c("5", "6") ~ "3",
      poverty_nbi_decile %in% c("7", "8") ~ "4",
      poverty_nbi_decile %in% c("9", "10") ~ "5",
      TRUE ~ NA),
    poverty_nbi_median = case_when(
      poverty_nbi_decile %in% c("1", "2", "3", "4", "5") ~ "Below Median",
      poverty_nbi_decile %in% c("6", "7", "8", "9", "10") ~ "Above Median",
      TRUE ~ NA)
  ) %>% 
  mutate(
    poverty_nbi_decile = as_factor(poverty_nbi_decile),
    poverty_nbi_quintile = as_factor(poverty_nbi_quintile),
    poverty_nbi_median = as_factor(poverty_nbi_median)
  )

I plot those new poverty quintiles. Now, the number of AGEB in each is much more balanced (altough not equal since we bin by population). We see that it does not correspond to the quintiles of income (would be horizontal cuts on the plot).

ageb_attributes %>% 
  # scatter poverty_nbi vs income_std
  ggplot(aes(x = poverty_nbi, y = income_log, color = poverty_nbi_quintile)) +
  geom_point() +
  scale_color_brewer(palette = "RdYlBu", direction = -1, na.value = "grey90") +
  theme_minimal()

I join these attributes to the main panel

data <- 
  left_join(data, ageb_attributes)

Baseline regressions for reference

First the baseline regressions for reference:

baseline_linear <- data %>% 
  fepois(reports_dv ~ tmean |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)

baseline_bins <- data %>% 
  mutate(bins_tmean_one = create_dynamic_bins(tmean, width = 1, 0.01)) %>% 
  fepois(reports_dv ~ i(bins_tmean_one, ref = "[16") |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)

Regs with `pobreza` (categorical only)

Either temp enters linearly:

reg_pobreza_linear <- data %>% 
  fepois(reports_dv ~ i(pobreza, tmean) |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)

reg_pobreza_linear %>% 
  etable() %>% 
  kable()

	.
Dependent Var.:	Reports for DV

Temperature x pobreza = [0,18]	0.0200*** (0.0039)
Temperature x pobreza = (18,34]	0.0334*** (0.0042)
Temperature x pobreza = (34,50]	0.0283*** (0.0043)
Temperature x pobreza = (50,70]	0.0329*** (0.0070)
Temperature x pobreza = (70,100]	0.0317 (0.0281)
Fixed-Effects:	——————
Prec, hum, wsp quintiles	Yes
Neighborhood	Yes
Year-Month	Yes
Day of Week	Yes
Day of Year	Yes
________________________________	__________________
S.E.: Clustered	by: Neighborhood
Observations	3,543,502

Or temp enters semi parametrically:

reg_pobreza_bins <- data %>% 
  mutate(bins_tmean_one = create_dynamic_bins(tmean, width = 1, 0.01)) %>% 
  fepois(reports_dv ~ i(pobreza, bins_tmean_one, ref2 = "[16") |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)

I plot these results.

Error: object 'coefs_quintile' not found

Regs with `poverty_nbi` (continuous)

This variable is continuous, so I can use it as a continuous variable or bin it.

Gradient-style heterogeneity

Because we can, but not interpretable as such.

reg_poverty_cont_linear <- data %>% 
  fepois(reports_dv ~ tmean * poverty_nbi |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)

etable(baseline_linear, reg_poverty_cont_linear) %>% 
  kable()

	baseline_linear	reg_poverty_cont_li..
Dependent Var.:	Reports for DV	Reports for DV

Temperature	0.0274*** (0.0033)	0.0077 (0.0063)
poverty_nbi		-1.3969 (41,209.5177)
Temperature x poverty_nbi		0.0288*** (0.0077)
Fixed-Effects:	——————	———————
Prec, hum, wsp quintiles	Yes	Yes
Neighborhood	Yes	Yes
Year-Month	Yes	Yes
Day of Week	Yes	Yes
Day of Year	Yes	Yes
_________________________	__________________	_____________________
S.E.: Clustered	by: Neighborhood	by: Neighborhood
Observations	3,624,826	3,576,634

N-tiles regressions

With the n-tiles I defined based on the continuous variable poverty_nbi.

Below and above median

First the simple linear version:

reg_above_below_linear <- data %>%
  fepois(reports_dv ~ i(poverty_nbi_median, tmean) |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)

etable(baseline_linear, reg_above_below_linear) %>% 
  kable()

	baseline_linear	reg_above_below_..
Dependent Var.:	Reports for DV	Reports for DV

Temperature	0.0274*** (0.0033)
Temperature x poverty_nbi_median = BelowMedian		0.0232*** (0.0038)
Temperature x poverty_nbi_median = AboveMedian		0.0311*** (0.0037)
Fixed-Effects:	——————	——————
Prec, hum, wsp quintiles	Yes	Yes
Neighborhood	Yes	Yes
Year-Month	Yes	Yes
Day of Week	Yes	Yes
Day of Year	Yes	Yes
________________________________________	__________________	__________________
S.E.: Clustered	by: Neighborhood	by: Neighborhood
Observations	3,624,826	3,578,253

Or semi parametric with bins:

reg_above_below_bins <- data %>%
  mutate(bins_tmean_one = create_dynamic_bins(tmean, width = 1, 0.01)) %>% 
  fepois(reports_dv ~ i(poverty_nbi_median, bins_tmean_one, ref2 = "[16") |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)

I can plot these results and save as reg_poverty_median.png

Quintile regressions

I first run the model imposing linearity on tmean and interacting it with poverty quintiles.

reg_quintile_linear <- data %>%
  fepois(reports_dv ~ i(poverty_nbi_quintile, tmean) |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)

etable(reg_quintile_linear) %>% 
  kable()

	reg_quintile_lin..
Dependent Var.:	Reports for DV

Temperature x poverty_nbi_quintile = 1	0.0149** (0.0048)
Temperature x poverty_nbi_quintile = 2	0.0274*** (0.0044)
Temperature x poverty_nbi_quintile = 3	0.0279*** (0.0046)
Temperature x poverty_nbi_quintile = 4	0.0303*** (0.0044)
Temperature x poverty_nbi_quintile = 5	0.0339*** (0.0047)
Fixed-Effects:	——————
Prec, hum, wsp quintiles	Yes
Neighborhood	Yes
Year-Month	Yes
Day of Week	Yes
Day of Year	Yes
______________________________________	__________________
S.E.: Clustered	by: Neighborhood
Observations	3,578,253

Export the table as reg_poverty_quintile.tex

I then allow for possible non-linearities in the relationship between temperature and DV. With the quintiles, its a bit of a crazy estimation…

tic("quintile")
reg_quintile_bins <- data %>%
  mutate(bins_tmean_one = create_dynamic_bins(tmean, width = 1, 0.01)) %>% 
  fepois(reports_dv ~ i(poverty_nbi_quintile, bins_tmean_one, ref2 = "[16") |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)
toc()

quintile: 953.265 sec elapsed

I plot these results.

I save the full results as reg_poverty_quintiles_bins_full.png.

And only the two extreme quintiles as reg_income_quintiles_bins_full.png

Decile

tic("decile")
reg_decile_linear <- data %>%
  fepois(reports_dv ~ i(poverty_nbi_decile, tmean) |
           prec_quintile + rh_quintile + wsp_quintile +
           ageb + year^month + day_of_week + day_of_year,
         cluster = ~ ageb)
toc()

decile: 231.703 sec elapsed

etable(reg_decile_linear) %>% 
  kable()

	reg_decile_linear
Dependent Var.:	Reports for DV

Temperature x poverty_nbi_decile = 1	0.0106. (0.0060)
Temperature x poverty_nbi_decile = 2	0.0188** (0.0061)
Temperature x poverty_nbi_decile = 3	0.0291*** (0.0053)
Temperature x poverty_nbi_decile = 4	0.0260*** (0.0055)
Temperature x poverty_nbi_decile = 5	0.0278*** (0.0058)
Temperature x poverty_nbi_decile = 6	0.0281*** (0.0058)
Temperature x poverty_nbi_decile = 7	0.0312*** (0.0056)
Temperature x poverty_nbi_decile = 8	0.0295*** (0.0055)
Temperature x poverty_nbi_decile = 9	0.0374*** (0.0060)
Temperature x poverty_nbi_decile = 10	0.0298*** (0.0060)
Fixed-Effects:	——————
Prec, hum, wsp quintiles	Yes
Neighborhood	Yes
Year-Month	Yes
Day of Week	Yes
Day of Year	Yes
_____________________________________	__________________
S.E.: Clustered	by: Neighborhood
Observations	3,578,253

Rendered on: 2025-04-04 18:56:19

Total render time: 3641.51 sec elapsed