Background

View Github Repo

This analysis explores how the recently proposed restructuring of the National Institutes of Health (NIH) fits into decades of historical funding patterns. In March 2025, HHS unveiled the “Make America Healthy Again” reorganization, which collapses the 27 existing NIH institutes and centers into eight new institutes and realigns several functions.

Details of the budget numbers were released on May 30, 2025, as part of the HHS FY 2026 Budget in Brief.

So with the help of ChatGPT, I asked: How do those new groupings compare to NIH’s budget trajectory back to FY 2000?

To be clear, I am not an expert in NIH funding or the budget process. I am a scientist interested in how the NIH budget has changed over time, and how the new reorganization plan fits into that historical context. There were several ambiguities in the budget document, and I used ChatGPT to help me parse the data. Please do let me know if you find any major errors or have suggestions for improvement.


Data Preparation

# Read budgets and mapping
budgets <- read.csv("budgets.csv", stringsAsFactors = FALSE)
mapping <- read.csv("mapping.csv", stringsAsFactors = FALSE)

# Separate historical vs. 2026 totals
hist_budgets <- budgets %>% filter(year < 2026)
new_totals   <- budgets %>% filter(year == 2026) %>%
  rename(new_ic = ic, total_2026 = budget) %>%
  select(new_ic, total_2026)

# Tag each old IC with its new institute
hist_joined <- hist_budgets %>%
  left_join(mapping, by = c("ic" = "old_ic")) %>%
  filter(!is.na(new_ic))

# Slice 2024 to compute group sums
slice_2024 <- hist_joined %>%
  filter(year == 2024) %>%
  select(ic, budget_2024 = budget, new_ic)

group_sums_2024 <- slice_2024 %>%
  group_by(new_ic) %>%
  summarize(group_sum_2024 = sum(budget_2024), .groups = "drop")

if (!"Eliminated" %in% new_totals$new_ic) {
  new_totals <- bind_rows(
    new_totals,
    tibble(new_ic = "Eliminated", total_2026 = group_sums_2024$group_sum_2024[group_sums_2024$new_ic == "Eliminated"])
  )
}

Data was collected from the NIH Almanac and the HHS FY 2026 Budget in Brief by ChatGPT in a CSV file. I have manually spot-checked about a dozen or so of the ~300 data points, and they seem to be correct.

I also asked it to create a mapping table for old-to-new IC designations. ChatGPT was surprisingly bad at this, though understandably so. For example, it had assigned NIDCR (National Institute for Dental and Craniofacial Research) to the new “Body Systems” institute, which is not what is outlined in the budget document. I had to manually correct this and a few other misassignments. The mapping table is included in the repo as mapping.csv.

The historical budgets from 2000 to 2024 were joined with a mapping of old ICs to new institutes proposed for 2026. The 2024 budget was sliced to compute group sums for each new institute.

Apportionment for FY 2026 Totals

Since the 2026 budget numbers are only reported for the new IC designations, we need to estimate how much each old IC would contribute to the new institute’s budget. The simplest assumption is that the 2026 budget for each new institute is proportionally allocated based on the 2024 budget of its old IC members. That is, the old ICs get the same relative share of the new institute’s budget in 2026 as they did in 2024.

Practically, we do this by taking the 2024 budget for each old IC, dividing it by the total budget for that new institute in 2024, and multiplying it by the total budget for that new institute in 2026.

Adjusting for inflation

To adjust historical budgets for inflation, we will use BRDPI (Biomedical Research and Development Price Index) provided by the NIH Office of Budget. This index allows us to convert historical budget numbers into real terms relative to a base year (2024 in this case). Indices were extracted from this PDF.

BRDPI adjusts not only for overall inflation, but specifically for things relevant to the NIH budget. It tracks “prices paid for the labor, supplies, equipment, and other inputs required to perform the biomedical research the NIH supports in its intramural laboratories and through its awards to extramural organizations.” Source: Bureau of Economic Analysis (BEA)

# 1) BRDPI lookup table
## Using numbers from the first column of the BRDPI table provided by NIH Office of Budget:
## https://officeofbudget.od.nih.gov/pdfs/FY24/GDP/BRDPI%20Price%20Index%20Annual%20and%20Cumulative%20Values_For_1950_Through_2029.pdf
brdpi <- tibble::tribble(
  ~year,   ~brdpi,
   2000L,   977.7,
   2010L,  1403.7,
   2020L,  1733.4,
   2024L,  1994.1,
   2026L,  2101.5
)

# 2) Base index (2024)
base_idx <- brdpi %>% filter(year==2026) %>% pull(brdpi)

# 3) Merge & compute real budgets
plot_data_real <- plot_data %>%
  # turn the factor‐year ("x") into an integer so we can join
  mutate(year = as.integer(as.character(x))) %>%
  left_join(brdpi, by="year") %>%
  mutate(
    weight_real = weight * (base_idx / brdpi)
  )

Alluvial Plots

Budget plots

These plot show the budget amounts in nominal or adjusted dollars in separate tabs. The x-axis represents fiscal years, while the y-axis shows the budget amounts in millions of USD. The colors represent the new institutes proposed for 2026, and the flows show how budgets from old ICs are redistributed into these new institutes.

Inflation-adjusted Budget Numbers

Note that the x-axis is not linear, as it represents discrete fiscal years.

palette <- c(
  "NCI"                    = "#377EB8",
  "Body Systems"           = "#E41A1C",
  "Neuroscience & Brain"   = "#4DAF4A",
  "NIAID"                  = "#984EA3",
  "GMS"                    = "#FF7F00",
  "Child & Women’s Health" = "#A65628",
  "NIA"                    = "#F781BF",
  "Behavioral Health"      = "#999999",
  "Office of the Director" = "#8DD3C7",
  "Eliminated"             = "grey90",
  "Moved out of NIH"       = "#666666"
)

plot_data_readable <- plot_data_real %>%
    rename(
    IC               = alluvium,
    Year             = x,
    Budget           = weight_real,
    IC_Year          = stratum,
    New_Institute = group
  )



myplot <- ggplot(plot_data_readable,
       aes(x = Year, stratum = IC_Year, alluvium = IC, y = Budget,
           fill = New_Institute, label=IC_Year)) +
  geom_flow(alpha = 0.5) +
  geom_stratum(width = 0.5, color = "grey30", alpha=0.8) +
  geom_text(
    data = plot_data_readable %>% filter(Year == "2026"),
    # aes(label = IC),
    stat = "stratum",
    size = 4,
    vjust = 0.5
  ) +
  scale_fill_manual(values = palette, na.value = "white") +
  labs(
    title    = "NIH Budget Alluvial: 2000→2010→2020→2024→2026",
    subtitle = "Flows from 27 old ICs into 8 new institutes (FY 2026 proposed)",
    x        = "Fiscal Year",
    y        = "Budget (Millions USD, inflation adjusted)",
    fill     = "2026 Institute"
  ) +
  # increase x- and y-axis label font sizes
  theme_minimal() +
  theme(axis.title.x = element_text(size = 18),
        axis.title.y = element_text(size = 18),
        axis.text.x  = element_text(size = 16),
        axis.text.y  = element_text(size = 16)) +
  theme(legend.position = "none")

myplot

ggsave("nih_budget_alluvial_adj.png", myplot, width = 10, height = 6, dpi = 300)

Raw Budget Numbers

Note that the x-axis is not linear, as it represents discrete fiscal years.

palette <- c(
  "NCI"                    = "#377EB8",
  "Body Systems"           = "#E41A1C",
  "Neuroscience & Brain"   = "#4DAF4A",
  "NIAID"                  = "#984EA3",
  "GMS"                    = "#FF7F00",
  "Child & Women’s Health" = "#A65628",
  "NIA"                    = "#F781BF",
  "Behavioral Health"      = "#999999",
  "Office of the Director" = "#8DD3C7",
  "Eliminated"             = "grey90",
  "Moved out of NIH"       = "#666666"
)

plot_data_readable <- plot_data_real %>%
    rename(
    IC               = alluvium,
    Year             = x,
    Budget           = weight,
    IC_Year          = stratum,
    New_Institute = group
  )



myplot <- ggplot(plot_data_readable,
       aes(x = Year, stratum = IC_Year, alluvium = IC, y = Budget,
           fill = New_Institute, label=IC_Year)) +
  geom_flow(alpha = 0.5) +
  geom_stratum(width = 0.5, color = "grey30", alpha=0.8) +
  geom_text(
    data = plot_data_readable %>% filter(Year == "2026"),
    # aes(label = IC),
    stat = "stratum",
    size = 4,
    vjust = 0.5
  ) +
  scale_fill_manual(values = palette, na.value = "white") +
  labs(
    title    = "NIH Budget Alluvial: 2000→2010→2020→2024→2026",
    subtitle = "Flows from 27 old ICs into 8 new institutes (FY 2026 proposed)",
    x        = "Fiscal Year",
    y        = "Budget (Millions USD, nominal)",
    fill     = "2026 Institute"
  ) +
  theme_minimal() +
  theme(axis.title.x = element_text(size = 18),
        axis.title.y = element_text(size = 18),
        axis.text.x  = element_text(size = 16),
        axis.text.y  = element_text(size = 16)) +
  theme(legend.position = "none")

myplot

ggsave("nih_budget_alluvial_raw.png", myplot, width = 10, height = 6, dpi = 300)

Data Table

Interactive tables

These tables shows the values in the plot above, with nominal and adjusted values in separate tabs. You can scroll, sort, and search using the table controls. The % Change column shows the percentage change in budget from FY 2024 to FY 2026 for each institute.

Inflation-adjusted Budget Numbers

library(tidyr)
library(DT)
library(RColorBrewer)

# 1) Pivot to wide, round, and compute pct_change
wide_data <- plot_data_real %>%
  select(alluvium, group, x, weight_real) %>%
  pivot_wider(names_from = x, values_from = weight_real) %>%
  arrange(group, alluvium) %>%
  mutate(across(where(is.numeric), ~ round(.))) %>%
  rename(
    IC                  = alluvium,
    `2026 Institute`    = group,
    FY2000              = `2000`,
    FY2010              = `2010`,
    FY2020              = `2020`,
    FY2024              = `2024`,
    FY2026              = `2026`
  ) %>%
  mutate(
    `% Change` = round((FY2026 - FY2024) / FY2024 * 100, 1)
  )

# 2) Prepare palettes with alpha = 0.6
inst_levels      <- sort(unique(wide_data$`2026 Institute`))
row_colors_alpha <- sapply(palette[inst_levels], function(hex) {
  rgb <- grDevices::col2rgb(hex)
  sprintf("rgba(%d,%d,%d,0.6)", rgb[1], rgb[2], rgb[3])
})
div_hex          <- brewer.pal(5, "RdYlGn")[c(1,5)]
div_colors_alpha <- sapply(div_hex, function(hex) {
  rgb <- grDevices::col2rgb(hex)
  sprintf("rgba(%d,%d,%d,0.6)", rgb[1], rgb[2], rgb[3])
})

# 3) Columns to shade by institute (everything except pct_change)
cols_institute <- setdiff(names(wide_data), "`% Change`")

# 4) Render the table
datatable(
  wide_data,
  rownames = FALSE,
  extensions = "Buttons",
  options = list(
    dom = 'Bfrtip',
    paging         = FALSE,
    scrollX        = TRUE,
    scrollY        = "400px",
    scrollCollapse = TRUE,
    buttons = c('copy', 'csv', 'excel')
  )
) %>%
  # A) Use the 2026 Institute *column* to color all the non–pct_change columns
  formatStyle(
    columns      = cols_institute,
    valueColumns = "2026 Institute",              # <-- lookups come from here
    backgroundColor = styleEqual(inst_levels, row_colors_alpha)
  ) %>%
  # B) Then style pct_change with diverging palette
  formatStyle(
    "% Change",
    backgroundColor = styleInterval(0, div_colors_alpha),
    color           = styleInterval(0, c("black","black"))
  )

Raw Budget Numbers

library(tidyr)
library(DT)
library(RColorBrewer)

# 1) Pivot to wide, round, and compute pct_change
wide_data <- plot_data_real %>%
  select(alluvium, group, x, weight) %>%
  pivot_wider(names_from = x, values_from = weight) %>%
  arrange(group, alluvium) %>%
  mutate(across(where(is.numeric), ~ round(.))) %>%
  rename(
    IC                  = alluvium,
    `2026 Institute`    = group,
    FY2000              = `2000`,
    FY2010              = `2010`,
    FY2020              = `2020`,
    FY2024              = `2024`,
    FY2026              = `2026`
  ) %>%
  mutate(
    `% Change` = round((FY2026 - FY2024) / FY2024 * 100, 1)
  )

# 2) Prepare palettes with alpha = 0.6
inst_levels      <- sort(unique(wide_data$`2026 Institute`))
row_colors_alpha <- sapply(palette[inst_levels], function(hex) {
  rgb <- grDevices::col2rgb(hex)
  sprintf("rgba(%d,%d,%d,0.6)", rgb[1], rgb[2], rgb[3])
})
div_hex          <- brewer.pal(5, "RdYlGn")[c(1,5)]
div_colors_alpha <- sapply(div_hex, function(hex) {
  rgb <- grDevices::col2rgb(hex)
  sprintf("rgba(%d,%d,%d,0.6)", rgb[1], rgb[2], rgb[3])
})

# 3) Columns to shade by institute (everything except pct_change)
cols_institute <- setdiff(names(wide_data), "`% Change`")

# 4) Render the table
datatable(
  wide_data,
  rownames = FALSE,
  extensions = "Buttons",
  options = list(
    dom = 'Bfrtip',
    paging         = FALSE,
    scrollX        = TRUE,
    scrollY        = "400px",
    scrollCollapse = TRUE,
    buttons = c('copy', 'csv', 'excel')
  )
) %>%
  # A) Use the 2026 Institute *column* to color all the non–pct_change columns
  formatStyle(
    columns      = cols_institute,
    valueColumns = "2026 Institute",              # <-- lookups come from here
    backgroundColor = styleEqual(inst_levels, row_colors_alpha)
  ) %>%
  # B) Then style pct_change with diverging palette
  formatStyle(
    "% Change",
    backgroundColor = styleInterval(0, div_colors_alpha),
    color           = styleInterval(0, c("black","black"))
  )
ggsave("nih_budget_alluvial.png", myplot, width = 10, height = 6, dpi = 300)

Interactive Plot

This is an attempt to make the plot interactive. Meh, it’s kinda crappy, but you can use the hover text to get identities and budgets for the smaller slices. Click and drag to zoom, triple click to zoom back out, and/or use the buttons on the top right corner.

library(plotly)
# Convert ggplot to plotly
ggplotly(myplot, tooltip=c("x","y","fill","stratum"))

R Session Info

sessionInfo()
## R version 4.4.1 (2024-06-14)
## Platform: aarch64-apple-darwin20
## Running under: macOS 15.5
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: America/New_York
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] plotly_4.10.4      RColorBrewer_1.1-3 metathis_1.1.4     DT_0.33           
## [5] ggalluvial_0.12.5  ggplot2_3.5.1      tidyr_1.3.1        dplyr_1.1.4       
## 
## loaded via a namespace (and not attached):
##  [1] sass_0.4.9        utf8_1.2.4        generics_0.1.3    digest_0.6.37    
##  [5] magrittr_2.0.3    evaluate_0.24.0   grid_4.4.1        fastmap_1.2.0    
##  [9] jsonlite_1.8.8    httr_1.4.7        purrr_1.0.2       fansi_1.0.6      
## [13] viridisLite_0.4.2 crosstalk_1.2.1   scales_1.3.0      lazyeval_0.2.2   
## [17] textshaping_1.0.1 jquerylib_0.1.4   cli_3.6.3         rlang_1.1.4      
## [21] munsell_0.5.1     withr_3.0.1       cachem_1.1.0      yaml_2.3.10      
## [25] tools_4.4.1       colorspace_2.1-1  vctrs_0.6.5       R6_2.5.1         
## [29] lifecycle_1.0.4   htmlwidgets_1.6.4 ragg_1.4.0        pkgconfig_2.0.3  
## [33] pillar_1.9.0      bslib_0.8.0       gtable_0.3.5      glue_1.7.0       
## [37] data.table_1.16.0 systemfonts_1.2.3 xfun_0.47         tibble_3.2.1     
## [41] tidyselect_1.2.1  highr_0.11        rstudioapi_0.17.1 knitr_1.48       
## [45] farver_2.1.2      htmltools_0.5.8.1 rmarkdown_2.28    labeling_0.4.3   
## [49] compiler_4.4.1