Methods: Cross-Agency Analysis

Methodology for comparing environmental health research funding across NIH, EPA, and NSF

Overview

This analysis compares environmental health research funding across three federal agencies: NIH, EPA, and NSF. The key finding: environmental health research falls through the cracks between agencies, with each focusing on different aspects of the exposure-to-disease pathway.

The three-agency Sankey diagram on the Funding Mismatch page visualizes how grants flow from agencies to research categories.

Data Sources

Agency | Data Source | Time Period | Total Grants
NIH | NIH RePORTER (web interface + API) | FY2022–2025 | ~325,000
EPA | USAspending.gov API (api.usaspending.gov/api/v2/search/spending_by_award/) | FY2022–2025 | 15,147
NSF | NSF Award Search API (api.nsf.gov/services/v1/awards.json) | FY2022–2025 | 27,362

Classification Categories

For cross-agency comparison, grants were classified into four categories:

Category | Description | Examples
Environmental Health (ENV_HEALTH) | Environmental exposure research with explicit health outcomes: epidemiology, toxicology, exposure assessment, health risk studies | PFAS exposure and liver disease; air pollution and childhood asthma cohort study
Environmental (Non-Health) (ENV_NOHEALTH) | Environmental monitoring, remediation, and protection without direct health research | PM2.5 monitoring networks; brownfields cleanup; water quality testing
Mechanisms (MECH) | Biological mechanisms research at molecular/cellular level (NSF only; EPA does not fund this type) | Protein signaling pathways; gene expression; cellular metabolism
Other (OTHER) | Infrastructure, other research fields, training, administrative | Equipment grants; conference support; training fellowships

NSF Grant Classification

Dataset

27,362 unique NSF grants (FY2022–2025) retrieved from the NSF Awards API. Grants were filtered to include only research awards (excluding fellowships, planning grants, and conference support where possible).
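
As a rough illustration of the retrieval step, the sketch below builds paged request URLs for the NSF awards endpoint and de-duplicates the returned records by award ID. It is not the original retrieval script. The query parameter names (`dateStart`, `dateEnd`, `offset`, `printFields`), the field list, and the mapping of FY2022–2025 to 10/01/2021 through 09/30/2025 are assumptions that should be checked against the NSF API documentation; `nsf_page_url` and `dedupe_awards` are hypothetical helper names.

```python
from urllib.parse import urlencode

NSF_AWARDS_URL = "https://api.nsf.gov/services/v1/awards.json"


def nsf_page_url(offset, date_start="10/01/2021", date_end="09/30/2025"):
    """Build the URL for one page of results (parameter names are assumptions)."""
    params = {
        "dateStart": date_start,  # assumed award-date filter, mm/dd/yyyy
        "dateEnd": date_end,
        "offset": offset,         # record offset used for paging
        "printFields": "id,title,abstractText",
    }
    return f"{NSF_AWARDS_URL}?{urlencode(params)}"


def dedupe_awards(awards):
    """Keep the first record seen for each award id."""
    unique = {}
    for award in awards:
        unique.setdefault(award["id"], award)
    return list(unique.values())
```

De-duplicating on the award ID is one way to arrive at a count of unique grants when overlapping queries or pages return the same award more than once.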

Classification Approach

NSF grants were classified using a hybrid keyword + directorate-based approach:

Step | Method | Details
1 | Keyword pattern matching | Environmental keywords: PFAS, microplastics, pesticide, toxicology, remediation, water treatment, contamination, pollutant, exposure assessment. Mechanisms keywords: cell, protein, gene, enzyme, pathway, signaling, biochemistry, molecular
2 | Directorate-based rules | BIO directorate grants defaulted to Mechanisms; GEO/ENG grants with contamination keywords assigned to Environmental
3 | LLM classification (tested) | Claude Haiku was tested but proved unreliable: only 16% of Haiku's "Environmental" classifications contained actual exposure terms. LLM was used only for edge cases after keyword filtering.
4 | Validation pass | All classifications verified against keyword presence to remove false positives
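
The keyword, directorate, and validation steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual classifier. The keyword lists come from the table above; everything else is an assumption, including whole-word matching with an optional plural "s", the precedence of environmental keywords over the BIO default, and the function names. The split of Environmental grants into health and non-health is not covered here, and the LLM step is omitted.

```python
import re

ENV_KEYWORDS = ["PFAS", "microplastics", "pesticide", "toxicology", "remediation",
                "water treatment", "contamination", "pollutant", "exposure assessment"]
MECH_KEYWORDS = ["cell", "protein", "gene", "enzyme", "pathway", "signaling",
                 "biochemistry", "molecular"]


def _compile(keywords):
    # Whole-word match with an optional plural "s", so "gene" does not hit "general".
    body = "|".join(re.escape(k) for k in keywords)
    return re.compile(r"\b(?:" + body + r")s?\b", re.IGNORECASE)


ENV_RE = _compile(ENV_KEYWORDS)
MECH_RE = _compile(MECH_KEYWORDS)


def classify_nsf(text, directorate):
    """Steps 1-2: keyword match first, then the directorate default (assumed order)."""
    if ENV_RE.search(text):
        return "ENVIRONMENTAL"
    if MECH_RE.search(text) or directorate == "BIO":
        return "MECH"
    return "OTHER"


def validate(label, text):
    """Step 4: demote an Environmental label that no environmental keyword supports."""
    if label == "ENVIRONMENTAL" and not ENV_RE.search(text):
        return "OTHER"
    return label


def keyword_support_rate(texts):
    """Fraction of texts with at least one environmental keyword."""
    if not texts:
        return 0.0
    return sum(bool(ENV_RE.search(t)) for t in texts) / len(texts)
```

Applied to the abstracts an LLM labelled "Environmental", a function like `keyword_support_rate` is one way to produce the kind of check reported in step 3, where only 16% of those labels were backed by exposure terms.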

NSF Classification Results

Category | Grant Count | Percentage
Environmental (total) | 619 | 2.3%
   → Environmental Health | 281 | 1.0%
   → Environmental (Non-Health) | 338 | 1.2%
Mechanisms | 2,193 | 8.0%
Other | 24,550 | 89.7%

EPA Grant Classification

Dataset

15,147 EPA grants (FY2022–2025) retrieved from USAspending.gov API. Filtered to Award Types 02, 03, 04, 05 (grants and cooperative agreements, excluding contracts).
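
For orientation, a request body for the `spending_by_award` endpoint with these filters might look like the sketch below. It only builds the JSON payload and does not send it. The award type codes come from the text above; the agency filter, the field names, the page size, and the mapping of FY2022–2025 to 2021-10-01 through 2025-09-30 are assumptions to verify against the USAspending API documentation, and `build_epa_payload` is a hypothetical helper name.

```python
import json

USASPENDING_URL = "https://api.usaspending.gov/api/v2/search/spending_by_award/"


def build_epa_payload(page, limit=100):
    """Return the POST body for one page of EPA grant results."""
    return {
        "filters": {
            # Assumed agency filter: EPA as the awarding top-tier agency.
            "agencies": [{"type": "awarding", "tier": "toptier",
                          "name": "Environmental Protection Agency"}],
            # Grants and cooperative agreements only, as stated in the text.
            "award_type_codes": ["02", "03", "04", "05"],
            # Assumed fiscal-year window for FY2022 through FY2025.
            "time_period": [{"start_date": "2021-10-01", "end_date": "2025-09-30"}],
        },
        # Field names are assumptions; check them against the API docs.
        "fields": ["Award ID", "Recipient Name", "Award Amount", "Description"],
        "page": page,
        "limit": limit,
    }


# The serialized body would be sent as JSON in a POST to USASPENDING_URL.
body = json.dumps(build_epa_payload(page=1))
```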

Classification Approach

EPA grants were classified using keyword patterns and CFDA program codes:

Step | Method | Details
1 | Health keyword matching | "health study", "epidemiology", "toxicology", "exposure assessment", "biomarker", "health risk assessment", "dose-response"
2 | Environmental keyword matching | "monitoring network", "remediation", "response program", "cleanup", "contamination", "Superfund", "brownfields"
3 | Infrastructure exclusion | "construction", "capital improvement", "equipment purchase", "state revolving fund"
4 | CFDA code validation | CFDA 66.034 (Surveys, Studies, Research) has highest Environmental Health concentration (47%); used as validation baseline
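
A minimal sketch of how these keyword rules could be combined is shown below. The term lists are taken from the table; the rest is assumed for illustration: case-insensitive substring matching, applying the infrastructure exclusion before the other rules, and the record keys `description` and `cfda`. It should not be read as the exact procedure used for the published counts.

```python
HEALTH_TERMS = ["health study", "epidemiology", "toxicology", "exposure assessment",
                "biomarker", "health risk assessment", "dose-response"]
ENV_TERMS = ["monitoring network", "remediation", "response program", "cleanup",
             "contamination", "Superfund", "brownfields"]
INFRA_TERMS = ["construction", "capital improvement", "equipment purchase",
               "state revolving fund"]


def contains_any(text, terms):
    """Case-insensitive substring test against a list of phrases."""
    lowered = text.lower()
    return any(term.lower() in lowered for term in terms)


def classify_epa(description):
    """Assign one category to an EPA grant description."""
    if contains_any(description, INFRA_TERMS):   # step 3, applied first (assumption)
        return "OTHER"
    if contains_any(description, HEALTH_TERMS):  # step 1
        return "ENV_HEALTH"
    if contains_any(description, ENV_TERMS):     # step 2
        return "ENV_NOHEALTH"
    return "OTHER"


def env_health_share(grants, cfda_code):
    """Step 4: share of grants under one CFDA code labelled ENV_HEALTH."""
    labels = [classify_epa(g["description"]) for g in grants if g["cfda"] == cfda_code]
    if not labels:
        return 0.0
    return sum(label == "ENV_HEALTH" for label in labels) / len(labels)
```

Comparing `env_health_share(grants, "66.034")` with the 47% baseline is one way to sanity-check a change to the keyword lists.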

EPA Classification Results

Category | Grant Count | Percentage
Environmental (total) | 7,752 | 51.2%
   → Environmental Health | 1,789 | 11.8%
   → Environmental (Non-Health) | 5,963 | 39.4%
Mechanisms | 0 | 0.0%
Other | 7,395 | 48.8%

Key Insight: EPA appears to devote a large share of its grants to environmental work (51%), but most of that work is operational: monitoring networks, remediation, and protection activities. Only 12% of EPA grants fund health research on how exposures affect human health.

Cross-Agency Comparison

Agency | Total Grants | Environmental Health | Environmental (Non-Health) | Mechanisms
NIH | ~500,000 | ~0.6% | N/A | ~47%
EPA | 15,147 | 11.8% | 39.4% | 0%
NSF | 27,362 | 1.0% | 1.2% | 8.0%
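
The EPA and NSF percentages in this table follow directly from the grant counts in the two results tables above. The short check below recomputes them, using the counts as reported on this page; NIH is left out because only approximate figures are given for it. The `shares` helper is a hypothetical name introduced for illustration.

```python
# Grant counts copied from the EPA and NSF classification results tables.
COUNTS = {
    "EPA": {"ENV_HEALTH": 1789, "ENV_NOHEALTH": 5963, "MECH": 0, "OTHER": 7395},
    "NSF": {"ENV_HEALTH": 281, "ENV_NOHEALTH": 338, "MECH": 2193, "OTHER": 24550},
}


def shares(counts):
    """Convert category counts to percentages of the agency total, one decimal."""
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


for agency, counts in COUNTS.items():
    print(agency, sum(counts.values()), shares(counts))
```

Because the shares are taken over grant counts, they say nothing about dollar amounts, which is the caveat noted under Limitations.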

The "Missing Middle": Exposure-to-Mechanism Research

Only 19 grants across NSF and EPA study how environmental exposures cause harm at the molecular level—the crucial link between identifying exposures and developing therapeutics.

Why This Gap Exists

Agency | Primary Focus | What's Missing
EPA | Environmental protection and monitoring | Molecular mechanisms of how exposures cause disease
NIH | Disease mechanisms and therapeutics | Environmental exposures as initiating factors
NSF | Basic science discovery | Health applications and translational focus

The result: We identify exposures (EPA), we understand disease mechanisms (NIH), but we don't connect the two. This represents the "exposure → mechanism → therapeutic" pipeline gap.


Limitations

  • Classification system differences: The NIH disease analysis (on the other Methods page) uses a different 4-category system optimized for disease-specific breakdowns. Cross-agency analysis uses Environmental Health / Environmental (Non-Health) / Mechanisms / Other for comparability.
  • EPA operational vs. research ambiguity: Distinguishing monitoring activities from health research is sometimes unclear from grant descriptions alone.
  • NSF keyword reliability: Initial LLM classification was unreliable; the hybrid approach improved accuracy but may still miss grants using non-standard terminology.
  • Time period: Cross-agency analysis uses FY2022–2025; results may vary by individual fiscal year.
  • Funding amounts: This analysis counts grants, not dollars. EPA infrastructure grants can be significantly larger than research grants.

Data Files

Full methodology documentation is maintained in federal_grants/:

  • epa/epa_analysis_summary.md — EPA classification methodology and results
  • nsf/nsf_analysis_summary.md — NSF classification methodology and results
  • agency_comparison/README.md — Cross-agency comparison methodology