Methods: Cross-Agency Analysis
Overview
This analysis compares environmental health research funding across three federal agencies: NIH, EPA, and NSF. The key finding: environmental health research falls through the cracks between agencies, with each focusing on different aspects of the exposure-to-disease pathway.
The 3-agency Sankey diagram on the Funding Mismatch page visualizes how funding flows from agencies to research categories.
Data Sources
| Agency | Data Source | Time Period | Total Grants |
|---|---|---|---|
| NIH | NIH RePORTER (web interface + API) | FY2022–2025 | ~325,000 |
| EPA | USAspending.gov API (api.usaspending.gov/api/v2/search/spending_by_award/) | FY2022–2025 | 15,147 |
| NSF | NSF Award Search API (api.nsf.gov/services/v1/awards.json) | FY2022–2025 | 27,362 |
Classification Categories
For cross-agency comparison, grants were classified into four categories:
| Category | Description | Examples |
|---|---|---|
| Environmental Health (ENV_HEALTH) | Environmental exposure research with explicit health outcomes: epidemiology, toxicology, exposure assessment, health risk studies | PFAS exposure and liver disease; air pollution and childhood asthma cohort study |
| Environmental (Non-Health) (ENV_NOHEALTH) | Environmental monitoring, remediation, and protection without direct health research | PM2.5 monitoring networks; brownfields cleanup; water quality testing |
| Mechanisms (MECH) | Biological mechanisms research at molecular/cellular level (NSF only; EPA does not fund this type) | Protein signaling pathways; gene expression; cellular metabolism |
| Other (OTHER) | Infrastructure, other research fields, training, administrative | Equipment grants; conference support; training fellowships |
NSF Grant Classification
Dataset
27,362 unique NSF grants (FY2022–2025) retrieved from the NSF Awards API. Grants were filtered to include only research awards (excluding fellowships, planning grants, and conference support where possible).
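As a rough illustration of paging through the NSF Award Search endpoint named under Data Sources, the sketch below only builds request URLs and does not call the network. The `https://` scheme, the query parameter names (`dateStart`, `dateEnd`, `offset`, `rpp`, `printFields`), and the choice of award date as the fiscal-year filter are assumptions to check against the API documentation. They are not taken from the original retrieval script.

```python
from urllib.parse import urlencode

# Endpoint from the Data Sources table; the https:// scheme is an assumption.
NSF_AWARDS_URL = "https://api.nsf.gov/services/v1/awards.json"


def nsf_page_url(offset=1, per_page=25):
    """Build the URL for one page of awards dated within FY2022-FY2025.

    Parameter names are assumptions about the NSF API, not confirmed
    by the source document.
    """
    params = {
        "dateStart": "10/01/2021",  # first day of federal FY2022
        "dateEnd": "09/30/2025",    # last day of federal FY2025
        "offset": offset,           # 1-based index of the first record
        "rpp": per_page,            # results per page
        "printFields": "id,title,abstractText",
    }
    return NSF_AWARDS_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Print the first two page URLs a retrieval loop would request.
    for start in (1, 26):
        print(nsf_page_url(offset=start))
```

A retrieval loop would request successive offsets until a page comes back empty, then deduplicate on the award ID to arrive at a count of unique grants.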
Classification Approach
NSF grants were classified using a hybrid keyword + directorate-based approach:
| Step | Method | Details |
|---|---|---|
| 1 | Keyword pattern matching | Environmental keywords: PFAS, microplastics, pesticide, toxicology, remediation, water treatment, contamination, pollutant, exposure assessment. Mechanisms keywords: cell, protein, gene, enzyme, pathway, signaling, biochemistry, molecular |
| 2 | Directorate-based rules | BIO directorate grants defaulted to Mechanisms; GEO/ENG grants with contamination keywords assigned to Environmental |
| 3 | LLM classification (tested) | Claude Haiku was tested but proved unreliable: only 16% of the grants it labeled "Environmental" contained actual exposure terms. The LLM was therefore used only for edge cases after keyword filtering. |
| 4 | Validation pass | All classifications verified against keyword presence to remove false positives |
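The keyword and directorate steps in the table can be sketched roughly as follows. The keyword lists come from Step 1. The precedence between rules, the whole-word matching with an optional plural "s", and the function name `classify_nsf` are assumptions made for illustration, not the pipeline's actual code. The sketch returns a single "ENV" label and does not attempt the Environmental Health / Non-Health split or the LLM edge-case step.

```python
import re

# Keyword lists copied from Step 1 of the table.
ENV_KEYWORDS = [
    "PFAS", "microplastics", "pesticide", "toxicology", "remediation",
    "water treatment", "contamination", "pollutant", "exposure assessment",
]
MECH_KEYWORDS = [
    "cell", "protein", "gene", "enzyme", "pathway", "signaling",
    "biochemistry", "molecular",
]


def _compile(keywords):
    # Whole-word match with an optional plural "s" (assumption), so that
    # "gene" matches "genes" but not "general".
    body = "|".join(re.escape(k) for k in keywords)
    return re.compile(r"\b(?:" + body + r")s?\b", re.IGNORECASE)


ENV_RE = _compile(ENV_KEYWORDS)
MECH_RE = _compile(MECH_KEYWORDS)


def classify_nsf(text, directorate):
    """Return "ENV", "MECH", or "OTHER" for one grant (illustrative only)."""
    has_env = bool(ENV_RE.search(text))
    has_mech = bool(MECH_RE.search(text))
    if has_env and has_mech:
        # Assumed tie-break: the directorate decides when both lists match.
        return "MECH" if directorate == "BIO" else "ENV"
    if has_env:
        return "ENV"
    if has_mech or directorate == "BIO":
        # BIO grants default to Mechanisms (Step 2).
        return "MECH"
    return "OTHER"


if __name__ == "__main__":
    print(classify_nsf("PFAS contamination in groundwater aquifers", "GEO"))
    print(classify_nsf("Protein signaling pathways in yeast", "BIO"))
```

Because "ENV" is only ever returned when an environmental keyword matched, the Step 4 requirement (every Environmental label is backed by keyword presence) holds by construction in this sketch.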
NSF Classification Results
| Category | Grant Count | Percentage |
|---|---|---|
| Environmental (total) | 619 | 2.3% |
| → Environmental Health | 281 | 1.0% |
| → Environmental (Non-Health) | 338 | 1.2% |
| Mechanisms | 2,193 | 8.0% |
| Other | 24,550 | 89.7% |
EPA Grant Classification
Dataset
15,147 EPA grants (FY2022–2025) retrieved from USAspending.gov API. Filtered to Award Types 02, 03, 04, 05 (grants and cooperative agreements, excluding contracts).
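The filter described above maps naturally onto a POST body for the `spending_by_award` endpoint. The sketch below only constructs that body and sends nothing. The award type codes and fiscal years come from the text. The agency name string, the requested `fields`, and the page size are assumptions to verify against the USAspending API documentation.

```python
import json


def fiscal_year_range(first_fy, last_fy):
    """Federal fiscal year N runs from Oct 1 of year N-1 to Sep 30 of year N."""
    return {"start_date": f"{first_fy - 1}-10-01", "end_date": f"{last_fy}-09-30"}


def build_epa_request(page=1, limit=100):
    """Request body for one page of EPA grants and cooperative agreements."""
    return {
        "filters": {
            # Award Types 02-05, as stated in the Dataset description.
            "award_type_codes": ["02", "03", "04", "05"],
            # Agency filter shape and name are assumptions.
            "agencies": [{
                "type": "awarding",
                "tier": "toptier",
                "name": "Environmental Protection Agency",
            }],
            "time_period": [fiscal_year_range(2022, 2025)],
        },
        # Field names are assumptions; request whatever the classifier needs.
        "fields": ["Award ID", "Recipient Name", "Award Amount", "Description"],
        "page": page,
        "limit": limit,
    }


if __name__ == "__main__":
    print(json.dumps(build_epa_request(), indent=2))
```

Excluding contracts happens entirely through `award_type_codes`, so the same body with different codes would return a different award population.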
Classification Approach
EPA grants were classified using keyword patterns and CFDA program codes:
| Step | Method | Details |
|---|---|---|
| 1 | Health keyword matching | "health study", "epidemiology", "toxicology", "exposure assessment", "biomarker", "health risk assessment", "dose-response" |
| 2 | Environmental keyword matching | "monitoring network", "remediation", "response program", "cleanup", "contamination", "Superfund", "brownfields" |
| 3 | Infrastructure exclusion | "construction", "capital improvement", "equipment purchase", "state revolving fund" |
| 4 | CFDA code validation | CFDA 66.034 (Surveys, Studies, Research) has the highest Environmental Health concentration (47%) and was used as the validation baseline |
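A minimal sketch of Steps 1 to 3, assuming plain case-insensitive substring matching on the grant description. The phrase lists are taken from the table. The precedence (infrastructure exclusion first, then health, then environmental) and the function name `classify_epa` are assumptions, and the CFDA validation in Step 4 is not implemented here.

```python
# Phrase lists copied from Steps 1-3 of the table, lowercased for matching.
HEALTH_TERMS = (
    "health study", "epidemiology", "toxicology", "exposure assessment",
    "biomarker", "health risk assessment", "dose-response",
)
ENV_TERMS = (
    "monitoring network", "remediation", "response program", "cleanup",
    "contamination", "superfund", "brownfields",
)
INFRA_TERMS = (
    "construction", "capital improvement", "equipment purchase",
    "state revolving fund",
)


def classify_epa(description):
    """Return ENV_HEALTH, ENV_NOHEALTH, or OTHER for one grant description."""
    text = description.lower()

    def has_any(terms):
        return any(term in text for term in terms)

    if has_any(INFRA_TERMS):
        return "OTHER"          # infrastructure exclusion (assumed to win)
    if has_any(HEALTH_TERMS):
        return "ENV_HEALTH"     # explicit health research terms
    if has_any(ENV_TERMS):
        return "ENV_NOHEALTH"   # monitoring, cleanup, protection
    return "OTHER"


if __name__ == "__main__":
    print(classify_epa("Brownfields cleanup and site remediation"))
```

Checking health terms before environmental terms means a grant that mentions both "contamination" and "epidemiology" is counted as Environmental Health. Reversing that order would shift grants from the health row to the non-health row.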
EPA Classification Results
| Category | Grant Count | Percentage |
|---|---|---|
| Environmental (total) | 7,752 | 51.2% |
| → Environmental Health | 1,789 | 11.8% |
| → Environmental (Non-Health) | 5,963 | 39.4% |
| Mechanisms | 0 | 0.0% |
| Other | 7,395 | 48.8% |
Key Insight: Although about half of EPA grants (51%) are classified as environmental, most of these are operational: monitoring networks, remediation, and protection activities. Only 12% of EPA grants fund research on how exposures affect human health.
Cross-Agency Comparison
| Agency | Total Grants | Environmental Health | Environmental (Non-Health) | Mechanisms |
|---|---|---|---|---|
| NIH | ~500,000 | ~0.6% | N/A | ~47% |
| EPA | 15,147 | 11.8% | 39.4% | 0% |
| NSF | 27,362 | 1.0% | 1.2% | 8.0% |
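The EPA and NSF percentages in this table follow directly from the grant counts in the two results tables. A small check is sketched below. NIH is omitted because only approximate figures are given for it, and the names `COUNTS` and `shares` are illustrative.

```python
# Grant counts from the NSF and EPA classification results tables.
COUNTS = {
    "EPA": {"ENV_HEALTH": 1789, "ENV_NOHEALTH": 5963, "MECH": 0, "OTHER": 7395},
    "NSF": {"ENV_HEALTH": 281, "ENV_NOHEALTH": 338, "MECH": 2193, "OTHER": 24550},
}


def shares(counts):
    """Return (total grants, {category: percent of total, one decimal})."""
    total = sum(counts.values())
    return total, {cat: round(100 * n / total, 1) for cat, n in counts.items()}


if __name__ == "__main__":
    for agency, counts in COUNTS.items():
        total, pct = shares(counts)
        print(agency, total, pct)
```

Each percentage is rounded independently, which is why NSF's combined environmental share (619 of 27,362, or 2.3%) is slightly higher than the sum of its rounded parts (1.0% + 1.2%).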
The "Missing Middle": Exposure-to-Mechanism Research
Only 19 grants across NSF and EPA study how environmental exposures cause harm at the molecular level—the crucial link between identifying exposures and developing therapeutics.
Why This Gap Exists
| Agency | Primary Focus | What's Missing |
|---|---|---|
| EPA | Environmental protection and monitoring | Molecular mechanisms of how exposures cause disease |
| NIH | Disease mechanisms and therapeutics | Environmental exposures as initiating factors |
| NSF | Basic science discovery | Health applications and translational focus |
The result: We identify exposures (EPA), we understand disease mechanisms (NIH), but we don't connect the two. This represents the "exposure → mechanism → therapeutic" pipeline gap.
Limitations
- Classification system differences: The NIH disease analysis (on the other Methods page) uses a different 4-category system optimized for disease-specific breakdowns. Cross-agency analysis uses Environmental Health / Environmental (Non-Health) / Mechanisms / Other for comparability.
- EPA operational vs. research ambiguity: Distinguishing monitoring activities from health research is sometimes unclear from grant descriptions alone.
- NSF keyword reliability: Initial LLM classification was unreliable; the hybrid approach improved accuracy but may still miss grants using non-standard terminology.
- Time period: Cross-agency analysis uses FY2022–2025; results may vary by individual fiscal year.
- Funding amounts: This analysis counts grants, not dollars. EPA infrastructure grants can be significantly larger than research grants.
Data Files
Full methodology documentation is maintained in federal_grants/:
- epa/epa_analysis_summary.md: EPA classification methodology and results
- nsf/nsf_analysis_summary.md: NSF classification methodology and results
- agency_comparison/README.md: Cross-agency comparison methodology