How the hantavirus risk index is computed
The Hantavirus Tracker risk index is a composite, per-region score from 0 to 100. It compresses four signals — news mention density, verified outbreak reports, source authority, and recency — into a single number designed to rank regions by current activity, not to predict case rates.
Formula
score = 100 * (
    0.40 * normalize(news_mentions_30d)
  + 0.30 * normalize(confirmed_cases_recent)
  + 0.20 * source_authority_weight
  + 0.10 * recency_decay
)
Inputs
- news_mentions_30d — count of distinct news items tagged to the region in the last 30 days, sourced from GDELT and Google News, deduplicated by URL hash. Normalized against the global maximum for the same window so the busiest region scores 1.0.
- confirmed_cases_recent — sum of case counts from confirmed outbreak events in the last 90 days. Sources include WHO DON, CDC HPS surveillance, PAHO bulletins, and ProMED-mail. Normalized against the global maximum for the same window.
- source_authority_weight — average authority weight of all signals contributing to a region. WHO and CDC are 1.0, PAHO 0.9, ProMED 0.8, GDELT and Google News 0.5. A region driven entirely by authoritative reports scores higher than one driven only by media chatter.
- recency_decay — a linear decay between the current moment and the latest event reported in the region: full weight at the moment of the event, zero at 90 days. Keeps stale regions from anchoring the index.
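The formula and inputs above can be sketched in Python as follows. This is a minimal illustration, not the site's actual implementation: the function names, the parameter names for the global maxima, and the handling of a zero global maximum are assumptions made for the example.

```python
def normalize(value: float, global_max: float) -> float:
    """Scale a raw count to [0, 1] against the global maximum for the window."""
    if global_max <= 0:
        # Assumption: if no region has any activity, every region normalizes to 0.
        return 0.0
    return min(max(value / global_max, 0.0), 1.0)


def risk_score(
    news_mentions_30d: float,
    max_news_mentions_30d: float,       # hypothetical name: global max, same 30-day window
    confirmed_cases_recent: float,
    max_confirmed_cases_recent: float,  # hypothetical name: global max, same 90-day window
    source_authority_weight: float,     # already in [0, 1]
    recency_decay: float,               # already in [0, 1]
) -> float:
    """Combine the four inputs using the weights given in the formula."""
    return 100 * (
        0.40 * normalize(news_mentions_30d, max_news_mentions_30d)
        + 0.30 * normalize(confirmed_cases_recent, max_confirmed_cases_recent)
        + 0.20 * source_authority_weight
        + 0.10 * recency_decay
    )
```

As a worked example, a region with half the global maximum of news mentions, a quarter of the global maximum of confirmed cases, an authority weight of 0.8, and a recency decay of 0.5 comes out at 100 * (0.20 + 0.075 + 0.16 + 0.05), or about 48.5, which falls in the Elevated bucket.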
Buckets
| Score | Bucket | Reading |
|---|---|---|
| 0–19 | Low | No active signal in the window |
| 20–39 | Moderate | Background mentions; nothing escalated |
| 40–59 | Elevated | Increased mentions or a one-off authoritative report |
| 60–79 | High | Active outbreak signal; multiple confirmed cases |
| 80–100 | Severe | WHO/CDC alert; sustained high-volume reporting |
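The bucket table can be read as a simple threshold lookup. The sketch below is one possible mapping; the function name is hypothetical, and it assumes fractional scores (for example 19.5) belong to the bucket whose lower bound they have reached, which the table itself does not specify.

```python
# Lower bound of each bucket, highest first, taken from the table above.
BUCKETS = [
    (80, "Severe"),
    (60, "High"),
    (40, "Elevated"),
    (20, "Moderate"),
    (0, "Low"),
]


def bucket(score: float) -> str:
    """Return the bucket label for a score in the range 0 to 100."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for lower_bound, label in BUCKETS:
        if score >= lower_bound:
            return label
    return "Low"  # unreachable for valid scores; kept as a defensive default
```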
Update cadence
- News + outbreak feeds: every 15 minutes (GDELT, Google News, WHO, ProMED).
- Per-region risk score: recomputed every hour.
- CDC HPS state-level surveillance: refreshed when CDC publishes its annual cumulative table.
- Premium alert dispatch: hourly check against subscriber preferences.
Limitations
- Hantavirus surveillance is not real-time globally. Many countries report case counts annually or only when an outbreak is confirmed.
- News mention density is a proxy for attention, not for incidence. Major non-outbreak news (e.g. a new vaccine paper) can elevate scores without reflecting actual case activity.
- Country-level resolution is the default; sub-national resolution is available only for the United States via CDC HPS data.
- Person-to-person transmission is not modeled — the index treats all signals as exposure-event reports.
Sources
- WHO Disease Outbreak News
- CDC Hantavirus Surveillance
- Pan American Health Organization (PAHO)
- ProMED-mail
- GDELT 2.0 DOC API
- Google News RSS (multi-language)
FAQ
Is the risk index a case-rate forecast?
No. The risk index is a composite signal of attention and reported activity, not a clinical forecast. A high score means surveillance bodies and news sources are reporting more activity in that region; it does not correspond to a specific number of cases per capita.
How often does the score update?
News and outbreak feeds refresh every 15 minutes via Vercel Cron. The composite per-region score is recomputed hourly. CDC HPS surveillance data is updated when CDC publishes its annual cumulative counts.
Why are some news sources weighted lower than WHO or CDC?
Authority weight reflects how much independent verification stands behind each source's reports. WHO Disease Outbreak News and CDC surveillance have the highest weight (1.0) because their releases require official verification. PAHO (0.9) and ProMED-mail (0.8) are next. GDELT and Google News (0.5) capture broader signal, including unconfirmed media chatter, so they contribute less per item.
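To make the averaging concrete, here is a small Python sketch using the weights listed on this page. It assumes the average is taken over individual signals (so three WHO items and one GDELT item count as four signals), and that a region with no signals gets 0.0; the dictionary keys and function name are illustrative only.

```python
# Per-source authority weights as stated in the methodology.
AUTHORITY_WEIGHTS = {
    "WHO": 1.0,
    "CDC": 1.0,
    "PAHO": 0.9,
    "ProMED": 0.8,
    "GDELT": 0.5,
    "Google News": 0.5,
}


def source_authority_weight(signal_sources: list[str]) -> float:
    """Average the authority weight over every signal tagged to a region."""
    if not signal_sources:
        # Assumption: a region with no contributing signals has zero authority.
        return 0.0
    total = sum(AUTHORITY_WEIGHTS[source] for source in signal_sources)
    return total / len(signal_sources)
```

Under these assumptions, a region with three WHO items and one GDELT item averages (1.0 + 1.0 + 1.0 + 0.5) / 4 = 0.875, while a region covered only by GDELT and Google News stays at 0.5.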
What does recency decay mean?
Recency decay linearly down-weights regions whose latest reported event is older. An event reported today counts at full strength, one from 45 days ago counts at half, and one from 90 days ago or earlier contributes nothing. This keeps the index responsive without amplifying years-old reports.
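The linear decay described here can be written in a few lines of Python. This is a sketch under the stated 90-day window; the function and constant names are hypothetical, and clamping events dated in the future to full weight is an assumption.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=90)  # decay reaches zero at 90 days


def recency_decay(latest_event: datetime, now: datetime) -> float:
    """Return 1.0 at the moment of the event, falling linearly to 0.0 at 90 days."""
    elapsed_fraction = (now - latest_event) / WINDOW  # timedelta / timedelta gives a float
    # Clamp so that old events give 0.0 and future-dated events give 1.0.
    return min(max(1.0 - elapsed_fraction, 0.0), 1.0)


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
yesterday = recency_decay(now - timedelta(days=1), now)   # about 0.989
halfway = recency_decay(now - timedelta(days=45), now)    # 0.5
expired = recency_decay(now - timedelta(days=90), now)    # 0.0
```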
Can I download the underlying data?
Yes — the public JSON API at /api/v1/regions, /api/v1/news, and /api/v1/events returns the same data the site uses. Attribution to Hantavirus Tracker (with a link) is required. See /api for usage.
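A minimal way to pull the data with the Python standard library might look like the sketch below. The base URL is an assumption taken from the citation link on this page, the helper names are hypothetical, and the response schema is not documented here, so the function simply returns whatever JSON the endpoint serves.

```python
import json
from urllib.request import urlopen

BASE_URL = "https://hantapulse.org"  # assumption: host taken from the methodology citation
ENDPOINTS = ("regions", "news", "events")  # endpoint names listed in the answer above


def endpoint_url(name: str) -> str:
    """Build the URL for one of the public /api/v1 endpoints."""
    if name not in ENDPOINTS:
        raise ValueError(f"unknown endpoint: {name!r}")
    return f"{BASE_URL}/api/v1/{name}"


def fetch(name: str, timeout: float = 10.0):
    """Download and decode the JSON served by the given endpoint."""
    with urlopen(endpoint_url(name), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch("regions")` would then return the decoded JSON for the regions endpoint; remember that reuse requires attribution to Hantavirus Tracker with a link.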
Hantavirus Tracker. Methodology: composite regional risk index. https://hantapulse.org/methodology