How to compute and visualize the Standardized Precipitation Index (SPI) for drought analysis

Part 1: A step-by-step guide to calculate the SPI with the {SCI} package in R using data from Somalia at subnational level

data analysis

drought

tidyverse

ggplot2

box plots

SCI

Standardized Precipitation Index (SPI)

Somalia

Author

Juan Armando Torres Munguía

Published

February 8, 2025

Overview

The Standardized Precipitation Index (SPI) quantifies how precipitation deviates from the historical average over multiple time scales (e.g., from 1 to 12 months), allowing for consistent monitoring of droughts and floods across regions and time, making it a valuable tool for early warning and risk assessment systems.

In this tutorial, I will walk through the process of calculating the SPI for Somalia at the subnational level using the {SCI} package in R.

About the data

The rainfall dataset used in this tutorial comes from the Humanitarian Data Exchange (HDX) initiative of the United Nations Office for the Coordination of Humanitarian Affairs (OCHA) and provides monthly precipitation data for Somalia’s subnational districts. It is derived from CHIRPS v2 (Climate Hazards Group InfraRed Precipitation with in situ Station data), using satellite imagery combined with ground-based observations.

For each 10-day period (dekad), the dataset includes the following indicators:

rfh: Total rainfall (mm) over 10 days
r1h: 1-month rolling rainfall sum (mm)
r3h: 3-month rolling rainfall sum (mm)
rfh_avg: Long-term average rainfall for the dekad (mm)
r1h_avg: Long-term average for 1-month rolling rainfall (mm)
r3h_avg: Long-term average for 3-month rolling rainfall (mm)
rfq: Rainfall anomaly (%)
r1q: Rainfall anomaly (%)
r3q: 3-month rainfall anomaly (%)

The data is aggregated at the subnational level and follows administrative boundaries defined by the World Food Programme (WFP), with each unit assigned a unique code (Pcode).

Set-up

Before we begin, we need to install and load the necessary R packages. We will be using the {tidyverse} package for data wrangling, as well as the {SCI} package for calculating the SPI.

library(tidyverse) 
library(SCI) # For Standardized Precipitation Index calculations (climate analysis)
library(conflicted) # For resolving conflicts between function names in different packages
library(kableExtra) 

library(ggplot2) 
library(ggtext) 
library(showtext) 

conflicts_prefer(dplyr::select) # Prefer 'select' function from dplyr package
conflicts_prefer(dplyr::lag) # Prefer 'lag' function from dplyr package
conflicts_prefer(dplyr::filter) # Prefer 'filter' function from dplyr package

Warning

Important to use {conflicted} package to specify preferred functions to avoid naming conflicts between {tidyverse} and {SCI} packages.

Loading data HDX

Data can be found here in file som-rainfall-adm2-full.csv. For Somalia, data is available from January 1981.

# Read the rainfall data for Somalia`
# You can also download the data from the HDX platform and read it locally
rainfall_data <- read.csv("https://data.humdata.org/dataset/ed6e1b4b-8094-47e6-bdf7-f6d56fa7abb9/resource/331ab214-0860-44ea-a53b-0a3fac601e82/download/som-rainfall-adm2-full.csv")

Data wrangling

We preprocess the data to extract relevant information to calculate the SPI. We extract the year and month from the date column, calculate the rainfall anomaly, and group the data by year, month, and district_code.

rainfall_data <- rainfall_data |>
  slice(-1) |> # Remove the first row (metadata)
  mutate(
    year = year(date), # Extract year from the 'date' column
    month = month(date), # Extract month from the 'date' column
    district_code = ADM2_PCODE # Rename the 'ADM2_PCODE' column to 'district_code'
  ) |>
  filter(year < 2025) |> # Filter data up to 2024
  rename(
    rainfall_monthly_total = rfh, # Rename the column 'rfh' to 'rainfall_monthly_total'
    rainfall_monthly_lt_average = rfh_avg # Rename the column 'rfh_avg' to 'rainfall_monthly_lt_average'
  ) |>
  mutate(
    rainfall_monthly_total = as.numeric(rainfall_monthly_total), # Convert rainfall total to numeric
    rainfall_monthly_lt_average = as.numeric(rainfall_monthly_lt_average) # Convert long-term average to numeric
  ) |>
  select(year, month, district_code, rainfall_monthly_total, rainfall_monthly_lt_average) # Select relevant columns

# Group by year, month, and district_code, and calculate mean rainfall values by district
rainfall_data <- rainfall_data |>
  group_by(year, month, district_code) |>
  summarise(
    rainfall_monthly_total = mean(rainfall_monthly_total, na.rm = TRUE), # Calculate the mean of rainfall total
    rainfall_monthly_lt_average = mean(rainfall_monthly_lt_average, na.rm = TRUE) # Calculate the mean of long-term average
  ) |>
  ungroup() |>
  # Optionally, you can calculate rainfall anomaly as a percentage of the long-term average
  # rainfall_anomaly = 1, indicates no anomaly
  # < 1 indicates below average
  # > 1 indicates above average
  mutate(
    rainfall_anomaly = rainfall_monthly_total / rainfall_monthly_lt_average
  )

We then calculate the SPI for different time scales (1, 3, 6, 12 months).

rainfall_data <- rainfall_data |>
  group_by(district_code) |> # Group by district_code
  mutate(
    spi1 = transformSCI(
      rainfall_monthly_total,
      first.mon = 1, # Start month for SPI calculation
      obj = fitSCI(rainfall_monthly_total,
        first.mon = 1, # Start month for SPI calculation
        time.scale = 1, # Time scale for SPI calculation
        distr = "gamma", # Distribution for fitting
        p0 = TRUE # Use initial parameter estimates
      )
    ),
    spi3 = transformSCI(
      rainfall_monthly_total,
      first.mon = 1,
      obj = fitSCI(rainfall_monthly_total,
        first.mon = 1,
        time.scale = 3,
        distr = "gamma",
        p0 = TRUE
      )
    ),
    spi6 = transformSCI(
      rainfall_monthly_total,
      first.mon = 1,
      obj = fitSCI(rainfall_monthly_total,
        first.mon = 1,
        time.scale = 6,
        distr = "gamma",
        p0 = TRUE
      )
    ),
    spi12 = transformSCI(
      rainfall_monthly_total,
      first.mon = 1,
      obj = fitSCI(rainfall_monthly_total,
        first.mon = 1,
        time.scale = 12,
        distr = "gamma",
        p0 = TRUE
      )
    )
  ) |>
  ungroup() |>
  arrange(district_code)

The output table looks like this:

year	month	district_code	rainfall_monthly_total	rainfall_monthly_lt_average	rainfall_anomaly	spi1	spi3	spi6	spi12
1981	1	SO1101	0.9417333	1.744867	0.5397165	-1.1043897	NA	NA	NA
1981	2	SO1101	2.0194333	2.027633	0.9959559	0.1537179	NA	NA	NA
1981	3	SO1101	32.8349667	9.696967	3.3861070	2.0618329	2.0528624	NA	NA
1981	4	SO1101	18.3139000	24.722233	0.7407866	-0.3498790	0.9691425	NA	NA
1981	5	SO1101	6.7928667	17.754700	0.3825954	-1.1682135	0.3442811	NA	NA
1981	6	SO1101	2.7443667	4.763233	0.5761563	-1.1812491	-1.5029868	0.1547485	NA
1981	7	SO1101	17.3592333	14.047767	1.2357290	0.7686622	-0.9047071	0.3837349	NA
1981	8	SO1101	34.8381667	26.758133	1.3019655	0.8281642	0.7648521	0.7381726	NA
1981	9	SO1101	29.1683000	19.003000	1.5349313	1.2521209	1.2301335	0.0193033	NA
1981	10	SO1101	4.1974333	8.226200	0.5102518	-0.3816687	0.7653220	0.1559546	NA

Interpretation

Drought events are indicated by negative SPI values, with more negative values indicating more severe drought conditions. Conversely, positive SPI values indicate wetter-than-average conditions. For demonstrating purposes, let’s visualize the SPI values for Laasqoray district in Sanaag region. One efficient way to visualize SPI values is by using a box plot with SPI categories.

font_add_google("Roboto Condensed", "Roboto Condensed")

showtext_auto()

# Custom theme for the chart
theme_boxplot_chart <- function() {
  theme_minimal(
    base_family = "Roboto Condensed" 
  ) +
    # Custom theme settings
    theme(
      # Axis settings
      axis.title = element_text(
        color = "grey10",
        face = "bold",
        size = 14
      ),
      axis.text = element_text(
        color = "grey10",
        face = "bold",
        size = 12
      ),

      # Title settings
      plot.title.position = "plot", 
      plot.title = element_textbox(
        color = "black",
        face = "bold",
        size = 24,
        margin = margin(5, 0, 5, 0), 
        width = unit(1, "npc") 
      ),

      # Legend settings
      legend.position = "top",
      legend.title = element_blank(),
      legend.key.height = unit(0.5, "cm"),
      legend.key.width = unit(0.5, "cm"),
      legend.spacing.x = unit(0.2, "cm"),
      legend.key.spacing = unit(0.5, "cm"), 
      legend.text = element_text(
        margin = margin(5, 0, 5, 0),
        face = "bold",
        color = "grey10",
        size = 10
      ),
      legend.direction = "horizontal",
      legend.byrow = FALSE,

      # Caption settings
      plot.caption = element_markdown(
        color = "grey70",
        face = "italic",
        size = 12,
        hjust = 0
        ),
      plot.background = element_rect(
        color = "white",
        fill = "white"
      ),
      plot.margin = margin(20, 40, 20, 40)
    )
}

title_chart <- "1-month Standardized Precipitation Index (SPI-1) in Laasqoray, Somalia for the period 1981-2024"
caption_chart <- paste0("**Data:** Own estimations using Somalia rainfall Indicators at Subnational Level (HDX)",
                        "<br>", 
                        "**Graphic:** Juan Torres Munguía")

#
spi_colors <- data.frame(
  ymin = c(-5, -2, -1.5, -1, 0, 1, 1.5, 2),
  ymax = c(-2, -1.5, -1, 0, 1, 1.5, 2, 5),
  label = factor(
    c(
      "Extremely dry", "Severely dry", "Moderately dry", "Near normal",
      "Near normal", "Moderately wet", "Severely wet", "Extremely wet"
    ),
    levels = c(
      "Extremely dry", "Severely dry", "Moderately dry", "Near normal",
      "Moderately wet", "Severely wet", "Extremely wet"
    )
  ),
  fill = c(
    "#F76D5E", "#FFAD72", "#FFE099", "#99FF99",
    "#99FF99", "#99EAFF", "#75D3FF", "#3D87FF"
  )
)

rainfall_data |>
  filter(district_code == "SO1503") |>
  mutate(month = factor(month, levels = 1:12, labels = month.name)) |> # Convert month to factor
  ggplot(aes(x = month, y = spi1)) +

  # Add box plot with SPI categories
  geom_rect(
    data = spi_colors, aes(ymin = ymin, ymax = ymax, xmin = -Inf, xmax = Inf, fill = label),
    alpha = 0.2, inherit.aes = FALSE
  ) +
  geom_boxplot(size = 1, outlier.size = 2, outlier.shape = 16) +
  scale_fill_manual(values = setNames(spi_colors$fill, spi_colors$label)) +
  labs(
    title = title_chart,
    caption = caption_chart,
    x = "",
    y = "SPI-1",
    fill = ""
  ) +
  guides(fill = guide_legend(nrow = 1)) + # Number of rows in the legend
  theme_boxplot_chart()

showtext_opts(dpi = 320) # Set the resolution of the image 320 dpi is for high-quality images ("retina")
ggsave(
  "spi-somalia.png",
  dpi = 320,
  width = 12,
  height = 9,
  units = "in"
)
showtext_auto(FALSE) # Turn off the showtext functionality

Citation

BibTeX citation:

@online{torres munguía2025,
  author = {Torres Munguía, Juan Armando},
  title = {How to Compute and Visualize the {Standardized}
    {Precipitation} {Index} {(SPI)} for Drought Analysis},
  date = {2025-02-08},
  langid = {en}
}

For attribution, please cite this work as:

Torres Munguía, Juan Armando. 2025. “How to Compute and Visualize the Standardized Precipitation Index (SPI) for Drought Analysis.” February 8, 2025.