Financial
A financial analysis would be the most straightforward method to
examine this data set. However, since our current exploration centers on
coverage, we will only touch upon several key points:
We must take into account the nature of self-reporting, where a field
manager reports on their intended actions. There may be discrepancies
between what is initially reported and the actual outcomes when the
transfers are executed. To perform a more precise financial analysis, it
would be more appropriate to consult more reliable sources collected
a posteriori, such as Post Distribution Monitoring reports. These provide
data derived directly from field observations and transfer records.
In this analysis I will be focusing on coverage aspects, aiming to
shed light on the operational intentions of these agencies without
delving deeply into the complexities of their actual financial
transactions or the variability in outcomes.
To illustrate this, let’s skim the surface of the financial data in
this dataset:
# Summarize the total cash disbursed by organization type (the column is spelled ORGANIZATION_TYEP_REPORTING in mpca)
cash_summary <- mpca %>%
group_by(ORGANIZATION_TYEP_REPORTING) %>%
summarise(
Total_Cash_Disbursed = sum(TOTAL_AMOUNT_OF_CASH_DISBURSED, na.rm = TRUE) # Summing while ignoring NA values
) %>%
arrange(desc(Total_Cash_Disbursed)) # Sort by total cash disbursed in descending order
# Create and style the table
cash_summary %>%
kable(
caption = "Total Cash Disbursed by Organization Type Reporting",
col.names = c("Organization Type", "Total Cash Disbursed ($)"),
digits = 2,
format.args = list(big.mark = ",") # Add thousand separators
) %>%
kable_styling(
bootstrap_options = c("striped", "hover", "condensed", "responsive"),
full_width = FALSE,
position = "center"
) %>%
row_spec(0, bold = TRUE, background = "#D3D3D3") # Highlight header row
Total Cash Disbursed by Organization Type Reporting

| Organization Type | Total Cash Disbursed ($) |
|---|---|
| UN Agency | 1,016,982,879 |
| International NGO | 160,018,364 |
| National NGO | 23,121,326 |
| RCRC Movement | 9,538,335 |
What is happening here? Did we disburse 1.2 billion dollars in cash
transfers during 2022? Not really. Although the volume of transfers for
a single emergency response was historic (some figures below), something
else is going on here.
The high number of outliers (1,610 records) suggests that, most likely,
some amounts were reported in UAH and others in USD or EUR. This is not
unheard of in this kind of response: the primary objective of this
dataset is field coordination, so validating locations and covered
beneficiaries takes priority over validating cash disbursement
figures.
These are some of the publicly reported figures for the major UN
agencies:
The World Food Programme (WFP) distributed roughly
$375.6 million in cash-based transfers, benefiting approximately 2.3
million individuals. Source: WFP Executive Board.
The United Nations High Commissioner for Refugees
(UNHCR) provided $202 million in both multi-purpose and
protection-specific cash assistance to almost 500,000 refugees. Source:
UNHCR Reporting.
While specific figures for the International Organization for
Migration (IOM)’s cash transfers in 2022 are not detailed in available
sources, IOM has been actively engaged in delivering cash-based
assistance to conflict-affected individuals across Ukraine.
The United Nations Children’s Fund (UNICEF)
initiated the ‘Spilno’ cash assistance program in March 2022, in
partnership with Ukraine’s Ministry of Social Policy. By August 2022,
this initiative had distributed approximately $125 million, supporting
over 350,000 children, including 35,000 with disabilities, across
120,000 households. Source: UNICEF.
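As a rough plausibility check, the publicly reported figures quoted above can be added up and set against the UN Agency total from the summary table. This is only a sketch: the object names are illustrative, IOM is left out because no figure is available, and the UNICEF amount only covers March to August 2022, so the comparison is indicative at best.

```r
# Publicly reported cash transfers (USD), as quoted in the text above.
# IOM is omitted (no figure available); UNICEF covers March-August 2022 only.
public_figures <- c(WFP = 375.6e6, UNHCR = 202e6, UNICEF = 125e6)

# UN Agency total taken from the summary table in this section.
reported_un_total <- 1016982879

public_total <- sum(public_figures)
ratio <- reported_un_total / public_total

cat("Publicly reported (partial):",
    format(public_total, big.mark = ",", scientific = FALSE), "\n")
cat("Reported in dataset:",
    format(reported_un_total, big.mark = ",", scientific = FALSE), "\n")
cat("Dataset / public ratio:", round(ratio, 2), "\n")
```

The dataset total for UN agencies comes out at roughly 1.45 times the partial public total. That gap is compatible with mixed-currency reporting, but it does not prove it, since the public figures listed here are incomplete.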
# Calculate basic statistics
transfer_stats <- mpca %>%
summarise(
Mean = mean(TRANSFER_VALUE_PER_HOUSEHOLD, na.rm = TRUE),
Median = median(TRANSFER_VALUE_PER_HOUSEHOLD, na.rm = TRUE),
Min = min(TRANSFER_VALUE_PER_HOUSEHOLD, na.rm = TRUE),
Max = max(TRANSFER_VALUE_PER_HOUSEHOLD, na.rm = TRUE),
SD = sd(TRANSFER_VALUE_PER_HOUSEHOLD, na.rm = TRUE),
Missing = sum(is.na(TRANSFER_VALUE_PER_HOUSEHOLD))
)
# View the statistics
transfer_stats
# A tibble: 1 × 6
Mean Median Min Max SD Missing
<dbl> <dbl> <dbl> <dbl> <dbl> <int>
1 94.8 68 30.6 2220 239. 563
We could probably identify these outliers and infer the most likely
reporting intention. But again, it would be easier and more precise to
use other sources for this kind of analysis.
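One possible starting point is to flag values that sit far outside the bulk of the distribution. The sketch below applies Tukey's 1.5 × IQR rule to a small made-up vector. The rule, the toy values and the helper name `flag_outliers` are assumptions for illustration, and not necessarily the method behind the 1,610 outliers mentioned earlier.

```r
# Flag values outside Tukey's fences (Q1 - k*IQR, Q3 + k*IQR).
# Hypothetical helper, not part of the original analysis.
flag_outliers <- function(x, k = 1.5) {
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  lower <- q[1] - k * iqr
  upper <- q[2] + k * iqr
  # Missing values are never flagged
  !is.na(x) & (x < lower | x > upper)
}

# Toy transfer values per household: most sit between 60 and 75, one value
# looks as if it had been entered in a different currency, and one is missing.
toy_values <- c(62, 68, 68, 70, 74, 2220, NA)

flagged <- flag_outliers(toy_values)
toy_values[flagged]
```

On the real data, the same helper could be applied to `TRANSFER_VALUE_PER_HOUSEHOLD` inside a `mutate()` call. Whether 1.5 is a sensible multiplier would need to be checked against the actual distribution.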
Demographics
One of the principles for identifying gender imbalance dynamics is the
collection of Sex and Age Disaggregated Data (SADD). Although methodologies
can differ slightly, the main principle is to differentiate population
groups in a meaningful way so that specific vulnerabilities can be
targeted.
In this case, the data was disaggregated in an unusual way: for people
over 60 years old, there was no distinction between men and women. For
this reason, I am presenting senior citizens and people with
disabilities (PwD) separately.
Additionally, total population figures are hard to come by because of
internal displacement and the lack of recent census information, so it
has not been possible to relate these figures to the total population.
This is an important limitation to take into account when reading these
tables.
# Define columns for Senior Citizens and PwD
special_categories <- c("ELDERLY_60", "PEOPLE_WITH_DISABILITIES")
# Convert these columns to numeric, handling "Not Collected"
mpca[special_categories] <- lapply(mpca[special_categories], function(x) {
as.numeric(gsub("Not Collected", NA, x))
})
# Summarize totals (sum) for these categories
special_totals <- colSums(mpca[special_categories], na.rm = TRUE)
special_data <- data.frame(
Category = c("Senior Citizens (+60)", "PwD"), # Custom labels
Total = special_totals
)
# Assign specific colors
custom_colors <- c("Senior Citizens (+60)" = "#F8F2BB", "PwD" = "#FCD1CA")
# Create the enhanced bar chart
ggplot(special_data, aes(x = Category, y = Total, fill = Category)) +
geom_bar(stat = "identity", color = "black", width = 0.5, show.legend = FALSE) +
geom_text(aes(label = scales::comma(Total)), vjust = -0.5, size = 5, fontface = "bold") + # Add bold totals
labs(
title = "Sum of Senior Citizens (+60) and PwD",
subtitle = "Representation of specific demographic groups",
x = "Category",
y = "Total Sum"
) +
theme_light(base_size = 14) + # Use a light theme for better aesthetics
scale_fill_manual(values = custom_colors) +
scale_y_continuous(labels = scales::comma, expand = expansion(mult = c(0, 0.1))) + # Add space above bars
theme(
plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
plot.subtitle = element_text(size = 12, hjust = 0.5),
plot.caption = element_text(size = 10, face = "italic", hjust = 1),
axis.text.x = element_text(size = 12, face = "bold"),
axis.text.y = element_text(size = 12)
)

# Define demographic columns
demographic_columns <- c("BOYS_0_17_YEARS", "GIRLS_0_17_YEARS",
"WOMEN_18_59_YEARS", "MEN_18_59_YEARS")
# Convert demographic columns to numeric, handling "Not Collected"
mpca[demographic_columns] <- lapply(mpca[demographic_columns], function(x) {
as.numeric(gsub("Not Collected", NA, x))
})
# Summarize totals for each category
demographic_totals <- colSums(mpca[demographic_columns], na.rm = TRUE)
demographic_data <- data.frame(
Category = c("Boys (0-17)", "Girls (0-17)", "Women (18-59)", "Men (18-59)"),
Total = demographic_totals
)
# Create positive and negative values for pyramid
demographic_data <- demographic_data %>%
mutate(Direction = case_when(
grepl("Girls|Women", Category) ~ "Female",
TRUE ~ "Male"
),
Value = ifelse(Direction == "Female", Total, -Total))
# Custom colors
custom_colors <- c("Female" = "#8600fc", "Male" = "#00C4AC")
# Create the pyramid plot
ggplot(demographic_data, aes(x = Category, y = Value, fill = Direction)) +
geom_bar(stat = "identity", color = "black", width = 0.5) +
geom_text(aes(label = scales::comma(abs(Value))), vjust = 0.5, hjust = ifelse(demographic_data$Value > 0, -0.1, 1.1),
size = 5, fontface = "bold") + # Add labels within the bars
coord_flip() +
scale_y_continuous(labels = abs, expand = expansion(mult = c(0.1, 0.1))) +
scale_fill_manual(values = custom_colors) +
labs(
title = "Demographic Pyramid: Men, Women, Boys, and Girls",
subtitle = "Total representation of age and gender groups",
x = "Demographic Categories",
y = "Population (Sum)"
) +
theme_light(base_size = 14) +
theme(
plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
plot.subtitle = element_text(size = 12, hjust = 0.5),
plot.caption = element_text(size = 10, face = "italic", hjust = 1),
axis.text.x = element_text(size = 12),
axis.text.y = element_text(size = 12, face = "bold"),
legend.position = "top",
legend.title = element_blank()
)

# Assign colors for gender
row_colors <- c("#8600fc", "#00C4AC") # Purple for Female, Green for Male
# Create and style the table
demographic_data %>%
kable("html", align = "c") %>%
kable_styling("striped", full_width = F, font_size = 16) %>%
row_spec(1, background = row_colors[2]) %>% # Boys (first row of demographic_data)
row_spec(2, background = row_colors[1], color = "white") %>% # Girls
row_spec(3, background = row_colors[1], color = "white") %>% # Women
row_spec(4, background = row_colors[2]) # Men
| | Category | Total | Direction | Value |
|---|---|---|---|---|
| BOYS_0_17_YEARS | Boys (0-17) | 233888.7 | Male | -233888.7 |
| GIRLS_0_17_YEARS | Girls (0-17) | 233901.8 | Female | 233901.8 |
| WOMEN_18_59_YEARS | Women (18-59) | 464087.0 | Female | 464087.0 |
| MEN_18_59_YEARS | Men (18-59) | 254813.4 | Male | -254813.4 |
Although we are still looking at absolute figures, adult men are clearly
underrepresented: about 255,000 men aged 18-59 against about 464,000
women, while boys and girls are almost evenly split. This is
particularly striking considering that most of the refugee population
consists of women and children, which would lead one to expect
relatively more men among those who remained in the country. This
pattern might suggest voluntary underreporting by families who fear
recruitment. Since the transfer amount is calculated based on the number
of family members, this could mean that larger families are receiving
less aid than initially intended. This would be an interesting line for
further research.
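The imbalance can be expressed as a male-to-female ratio per age band. The following is a minimal sketch that reuses the totals printed in the table above; the object names are illustrative.

```r
# Totals copied from the demographic table above.
totals <- c(boys = 233888.7, girls = 233901.8,
            women = 464087.0, men = 254813.4)

# Male-to-female ratio in each age band.
ratio_children <- totals[["boys"]] / totals[["girls"]]
ratio_adults   <- totals[["men"]] / totals[["women"]]

round(c(children = ratio_children, adults = ratio_adults), 2)
```

The ratio is very close to 1 for children and about 0.55 for adults, so the gap is specific to the 18-59 age band.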
Geographical analysis
# Summarize total population by oblast
oblast_summary <- mpca %>%
group_by(OBLAST) %>% # Group by oblast
summarise(Total_Population = sum(across(all_of(demographic_columns)), na.rm = TRUE)) %>%
mutate(Percentage = (Total_Population / sum(Total_Population)) * 100) %>% # Calculate percentage
arrange(desc(Total_Population)) # Arrange by total population in descending order
# Create and style the table
oblast_summary %>%
mutate(
Total_Population = scales::comma(Total_Population), # Format totals with commas
Percentage = scales::percent(Percentage / 100, accuracy = 0.1) # Format percentages
) %>%
kable("html", col.names = c("Oblast", "Total Population", "Percentage"), align = "c") %>%
kable_styling("striped", full_width = F, font_size = 16)
| Oblast | Total Population | Percentage |
|---|---|---|
| Dnipropetrovska | 137,043 | 11.5% |
| Lvivska | 119,173 | 10.0% |
| Vinnytska | 86,944 | 7.3% |
| Zakarpatska | 83,831 | 7.1% |
| Zaporizka | 75,750 | 6.4% |
| Kharkivska | 71,099 | 6.0% |
| Khmelnytska | 61,492 | 5.2% |
| Chernivetska | 60,782 | 5.1% |
| Poltavska | 51,586 | 4.3% |
| Donetska | 48,742 | 4.1% |
| Odeska | 46,637 | 3.9% |
| Ivano-Frankivska | 40,694 | 3.4% |
| Mykolaivska | 39,621 | 3.3% |
| Kirovohradska | 38,489 | 3.2% |
| Sumska | 33,613 | 2.8% |
| Ternopilska | 30,485 | 2.6% |
| Kyivska | 30,203 | 2.5% |
| Cherkaska | 27,714 | 2.3% |
| Zhytomyrska | 21,159 | 1.8% |
| Chernihivska | 20,773 | 1.8% |
| Kyiv | 18,385 | 1.5% |
| Khersonska | 16,377 | 1.4% |
| Volynska | 9,503 | 0.8% |
| Luhanska | 8,674 | 0.7% |
| Rivnenska | 7,374 | 0.6% |
| NA | 511 | 0.0% |
| Sevastopilska | 24 | 0.0% |
| Avtonomna Respublika Krym | 13 | 0.0% |
# Define the oblast order from East to West (reversed later so the axis reads West to East)
oblast_order <- c(
"Luhanska", "Donetska", "Kharkivska", "Zaporizka", "Dnipropetrovska",
"Khersonska", "Mykolaivska", "Odeska", "Poltavska", "Kirovohradska",
"Cherkaska", "Vinnytska", "Sumska", "Chernihivska", "Kyivska",
"Zhytomyrska", "Rivnenska", "Volynska", "Khmelnytska", "Ternopilska",
"Chernivetska", "Ivano-Frankivska", "Lvivska", "Zakarpatska"
)
# Summarize total population by oblast
oblast_summary <- mpca %>%
group_by(OBLAST) %>%
summarise(Total_Population = sum(across(all_of(demographic_columns)), na.rm = TRUE)) %>%
filter(OBLAST %in% oblast_order) %>% # Keep only oblasts in the order list
mutate(OBLAST = factor(OBLAST, levels = oblast_order)) # Set factor levels to match the order
# Create the bar chart
ggplot(oblast_summary, aes(x = OBLAST, y = Total_Population, fill = OBLAST)) +
geom_bar(stat = "identity", color = "black", show.legend = FALSE) +
labs(
title = "Population Distribution by Oblast",
x = "Oblast (West to East)",
y = "Total Population"
) +
scale_y_continuous(labels = scales::comma) +
scale_x_discrete(limits = rev(oblast_order)) + # Reverse so the axis reads West to East, left to right
theme_minimal(base_size = 14) +
theme(
plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
axis.text.x = element_text(angle = 45, hjust = 1)
)

Now let’s analyze the distribution of individual beneficiaries by
Oblast, which is the highest administrative division under the State.
The oblasts are arranged based on their geographical position in the
country, from West to East, left to right, as they approach the
frontline. The number of beneficiaries in oblasts farther from the
frontline is comparable to those closer to it. Could this be because
people relocated to the western border? While this might have been true
at the beginning of the war, this data encompasses all of 2022. Another
plausible (and somewhat problematic) explanation is that many
humanitarian organizations established themselves in the West (notably
in the city of Lviv), reaching more people there than the relatively few
organizations that deployed to the East.
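To put a number on this comparison, the sketch below adds up the totals from the oblast table above for the four westernmost and the four easternmost oblasts, following the order defined in `oblast_order`. Grouping by four is an arbitrary choice made for illustration, and the vector names are hypothetical.

```r
# Totals copied from the oblast table above.
# The four westernmost and four easternmost oblasts according to oblast_order.
west <- c(Zakarpatska = 83831, Lvivska = 119173,
          `Ivano-Frankivska` = 40694, Chernivetska = 60782)
east <- c(Luhanska = 8674, Donetska = 48742,
          Kharkivska = 71099, Zaporizka = 75750)

comparison <- c(West = sum(west), East = sum(east))
format(comparison, big.mark = ",")
```

With this grouping, the western oblasts add up to 304,480 individuals and the eastern ones to 204,265, so the areas farthest from the frontline are at least as well covered as those closest to it.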
# Summarize total population by oblast and organization type
oblast_summary <- mpca %>%
group_by(OBLAST, ORGANIZATION_TYEP_REPORTING) %>% # Include organization type in the grouping
summarise(Total_Population = sum(across(all_of(demographic_columns)), na.rm = TRUE), .groups = "drop") %>% # Drop grouping before mutating OBLAST
filter(OBLAST %in% oblast_order) %>% # Keep only oblasts in the order list
mutate(OBLAST = factor(OBLAST, levels = oblast_order)) # Set factor levels to match the order
# Create the stacked bar chart
ggplot(oblast_summary, aes(x = OBLAST, y = Total_Population, fill = ORGANIZATION_TYEP_REPORTING)) +
geom_bar(stat = "identity", color = "black") +
labs(
title = "Population Distribution by Oblast and Organization Type",
x = "Oblast (West to East)",
y = "Total Population",
fill = "Organization Type"
) +
scale_y_continuous(labels = scales::comma) +
scale_x_discrete(limits = rev(oblast_order)) + # Reverse so the axis reads West to East, left to right
theme_minimal(base_size = 14) +
theme(
plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
axis.text.x = element_text(angle = 45, hjust = 1)
)

By type of organization, we can observe that a significant portion of
those reached in the west were assisted by UN agencies. This could be
attributed to the stricter security constraints faced by their staff
compared to NGOs.
# Load additional libraries for arrow annotation
library(grid)
# Create the stacked bar chart
ggplot(oblast_summary, aes(x = OBLAST, y = Total_Population, fill = ORGANIZATION_TYEP_REPORTING)) +
geom_bar(stat = "identity", color = "black") +
labs(
title = "Population Distribution by Oblast and Organization Type",
subtitle = "Oblasts shown in order of proximity to the frontline",
x = "Oblast (from West to East)",
y = "Total Population",
fill = "Organization Type"
) +
scale_y_continuous(labels = scales::comma) +
scale_x_discrete(limits = rev(oblast_order)) + # Reverse so the axis reads West to East, left to right
theme_minimal(base_size = 14) +
theme(
plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
axis.text.x = element_text(angle = 45, hjust = 1),
plot.margin = margin(t = 40, r = 20, b = 20, l = 20) # Adjust margin for extra space
) +
# Add an arrow annotation
annotation_custom(
grob = grid::linesGrob(arrow = arrow(type = "closed", length = unit(0.25, "inches")),
gp = gpar(col = "black", lwd = 2)),
ymin = max(oblast_summary$Total_Population) * 1.05, ymax = max(oblast_summary$Total_Population) * 1.05,
xmin = 1, xmax = 24 # Control arrow position
) +
# Add text for "Frontline"
annotate("text", x = 23, y = max(oblast_summary$Total_Population) * 1.1, label = "Frontline",
size = 5, hjust = 0, fontface = "bold")

# Define the Oblasts of interest
oblasts_of_interest <- c("Donetska", "Lvivska", "Zakarpatska", "Kharkivska") # Two eastern and two western oblasts
# Filter and analyze Raion coverage by Oblast
raion_coverage_by_oblast <- mpca %>%
filter(OBLAST %in% oblasts_of_interest, !is.na(RAION)) %>% # Filter for selected Oblasts and valid Raions
group_by(OBLAST) %>% # Group by Oblast
summarise(raions_covered = n_distinct(RAION)) # Count unique Raions per Oblast
# Print the results
print(raion_coverage_by_oblast)
# A tibble: 4 × 2
OBLAST raions_covered
<fct> <int>
1 Donetska 8
2 Kharkivska 7
3 Lvivska 7
4 Zakarpatska 7
# Visualization with proper Oblast order and sorting within each chart
ggplot(data = mpca %>%
filter(OBLAST %in% c("Lvivska", "Kharkivska", "Zakarpatska", "Odeska"), !is.na(RAION)) %>%
mutate(
OBLAST = factor(OBLAST, levels = c("Lvivska", "Kharkivska", "Zakarpatska", "Odeska")), # Correct Oblast order
RAION = reorder(RAION, -NUMBER_OF_BENEFICIARIES_INDV) # Sort Raions by number of beneficiaries
),
aes(x = RAION, y = NUMBER_OF_BENEFICIARIES_INDV, fill = OBLAST)) +
geom_bar(stat = "identity", show.legend = FALSE) + # Bar chart without labels
facet_wrap(~ OBLAST, scales = "free", ncol = 2) + # Separate facets by Oblast
coord_flip() + # Flip axes for better readability
scale_y_continuous(labels = scales::comma, expand = expansion(mult = c(0, 0.1))) +
labs(
title = "Number of Beneficiaries Reached per Raion",
subtitle = "Comparing coverage in Lvivska, Kharkivska, Zakarpatska, and Odeska Oblasts",
x = "Raion",
y = "Number of Beneficiaries"
) +
theme_minimal(base_size = 14) +
theme(
strip.text = element_text(face = "bold", size = 14),
axis.text.x = element_text(size = 12),
axis.text.y = element_text(size = 10),
plot.title = element_text(face = "bold", size = 16),
plot.subtitle = element_text(size = 12, face = "italic")
) +
scale_fill_brewer(palette = "Set2")

On the left, we see two oblasts far from the frontline (Lvivska and
Zakarpatska), while on the right there are two oblasts closer to it
(Kharkivska and Odeska). Both pairs show a similar distribution, with
more beneficiaries reached in areas less affected by the hostilities.
# Rural/Urban Labels and Formatting
rural_urban_summary <- mpca %>%
filter(!is.na(FINAL_RURAL_URBAN)) %>%
mutate(
FINAL_RURAL_URBAN = case_when(
FINAL_RURAL_URBAN == "rural/ сільський" ~ "Rural",
FINAL_RURAL_URBAN == "urban / міський" ~ "Urban",
TRUE ~ FINAL_RURAL_URBAN
)
) %>%
group_by(FINAL_RURAL_URBAN) %>%
summarise(
`# of Individuals` = sum(NUMBER_OF_BENEFICIARIES_INDV, na.rm = TRUE),
`# of Actions` = n()
) %>%
mutate(
`# of Individuals` = scales::comma(`# of Individuals`),
`# of Actions` = scales::comma(`# of Actions`)
)
# Display the table with kable
rural_urban_summary %>%
kbl(caption = "Summary of Beneficiaries by Rural/Urban Classification") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
full_width = FALSE, font_size = 12) %>%
column_spec(1, bold = TRUE) %>% # Highlight the Rural/Urban column
add_header_above(c(" " = 1, "Beneficiaries Data" = 2)) # Add a header above the numeric columns
Summary of Beneficiaries by Rural/Urban Classification

| FINAL_RURAL_URBAN | # of Individuals | # of Actions |
|---|---|---|
| Rural | 298,551 | 18,752 |
| Urban | 222,316 | 7,540 |
The large number of actions in rural areas (18,752, against 7,540 in
urban areas) reflects a strong commitment to reaching the most
vulnerable populations. These efforts are logistically more challenging
and typically serve fewer people per action, making them more resource-
and energy-intensive.
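The "fewer people per action" point can be checked directly from the summary table above. The sketch below divides individuals by actions for each classification; the data frame and column names are illustrative, and an "action" is taken to mean one reported row in the dataset.

```r
# Figures copied from the rural/urban summary table above.
rural_urban <- data.frame(
  area        = c("Rural", "Urban"),
  individuals = c(298551, 222316),
  actions     = c(18752, 7540)
)

# Average number of individuals reached per reported action.
rural_urban$individuals_per_action <-
  round(rural_urban$individuals / rural_urban$actions, 1)

rural_urban
```

This gives about 15.9 individuals per action in rural areas against about 29.5 in urban areas, roughly half as many people reached per reported action.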
# Calculate percentages for urban and rural beneficiaries and actions
organization_rural_urban_summary <- mpca %>%
filter(!is.na(FINAL_RURAL_URBAN), !is.na(ORGANIZATION_TYEP_REPORTING)) %>% # Exclude missing data
mutate(
FINAL_RURAL_URBAN = case_when(
FINAL_RURAL_URBAN == "rural/ сільський" ~ "Rural",
FINAL_RURAL_URBAN == "urban / міський" ~ "Urban",
TRUE ~ FINAL_RURAL_URBAN
)
) %>%
group_by(ORGANIZATION_TYEP_REPORTING, FINAL_RURAL_URBAN) %>%
summarise(
Total_Beneficiaries = sum(NUMBER_OF_BENEFICIARIES_INDV, na.rm = TRUE),
Total_Actions = n()
) %>%
group_by(ORGANIZATION_TYEP_REPORTING) %>%
mutate(
`% of Beneficiaries` = round(100 * Total_Beneficiaries / sum(Total_Beneficiaries), 2),
`% of Actions` = round(100 * Total_Actions / sum(Total_Actions), 2)
) %>%
ungroup()
# Prepare the table for display (only percentages)
organization_rural_urban_summary %>%
select(
`Organization Type` = ORGANIZATION_TYEP_REPORTING,
`Rural/Urban` = FINAL_RURAL_URBAN,
`% of Beneficiaries`,
`% of Actions`
) %>%
kbl(caption = "Percentage of Urban and Rural Beneficiaries and Actions per Organization Type") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
full_width = FALSE, font_size = 12) %>%
column_spec(1, bold = TRUE) %>%
add_header_above(c(" " = 2, "Percentages" = 2))
Percentage of Urban and Rural Beneficiaries and Actions per Organization Type

| Organization Type | Rural/Urban | % of Beneficiaries | % of Actions |
|---|---|---|---|
| International NGO | Rural | 59.79 | 71.46 |
| International NGO | Urban | 40.21 | 28.54 |
| National NGO | Rural | 37.04 | 64.14 |
| National NGO | Urban | 62.96 | 35.86 |
| RCRC Movement | Rural | 91.71 | 95.00 |
| RCRC Movement | Urban | 8.29 | 5.00 |
| UN Agency | Rural | 55.09 | 38.46 |
| UN Agency | Urban | 44.91 | 61.54 |
# Remove NA and summarize for pie chart
rural_urban_total <- mpca %>%
filter(!is.na(FINAL_RURAL_URBAN)) %>%
group_by(FINAL_RURAL_URBAN) %>%
summarise(Total_Beneficiaries = sum(NUMBER_OF_BENEFICIARIES_INDV, na.rm = TRUE))
# Pie chart with numbers displayed
ggplot(rural_urban_total, aes(x = "", y = Total_Beneficiaries, fill = FINAL_RURAL_URBAN)) +
geom_bar(stat = "identity", width = 1, color = "white") +
coord_polar(theta = "y") +
scale_fill_brewer(palette = "Set2") +
labs(
title = "Overall Rural vs Urban Distribution",
x = NULL,
y = NULL,
fill = "Rural/Urban"
) +
geom_text(aes(label = scales::comma(Total_Beneficiaries)),
position = position_stack(vjust = 0.5),
size = 5) +
theme_void() +
theme(
plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
legend.position = "top"
)

This analysis is intentionally kept at the descriptive level. Using
inferential statistics or predictive modeling, while an interesting
exercise, would have been of very limited value given the nature of
this data collection, which was not conceived for research.
Once again, let’s acknowledge the great effort that the MPCA technical
working group put into collecting and processing the data.