
MML - Hurricane

As hurricanes grow stronger and more unpredictable, the race to predict them years in advance has never been more urgent. Traditional weather models excel at short-term forecasts but struggle with long-range predictions. Our latest research combines AI, historical storm data, and even Wikipedia’s geographic insights to forecast hurricane risks up to five years ahead, with a 20–25% improvement in ROC AUC over single-modality baselines. The goal? To give coastal cities, insurers, and disaster responders a crucial head start against the storms of tomorrow.

Authors: Nicole Zhang & Zein Mukhanov
Read time: 5 min

How we're teaching machines to spot storm threats before they form.


Why This Matters: Hurricanes Are Getting Stronger

Since 1980, tropical cyclones have caused over $1.1 trillion in U.S. damages. They're now intensifying faster, tracking into new areas, and holding more rain — trends linked to warmer oceans and shifting wind patterns.

We're already seeing it this season:

Tropical Storm Erin is forecast to strengthen into the first hurricane of the 2025 Atlantic season — possibly a Category 3 by the weekend. Current projections place it between Bermuda and the U.S. East Coast, but forecasters warn that even small track changes could mean significant impacts.

(Source: ABC News, Aug 11, 2025)

Short-term tracking like this is strong — we can map Erin's likely corridor a week out — but multi-year forecasting remains a frontier problem. That's where multimodal AI comes in.

Our Approach: From Nowcasting to Long-Horizon Risk

Traditional hurricane prediction relies on numerical weather prediction (NWP) models, which are powerful physics-based simulations of atmosphere–ocean interactions. These excel at short-term and seasonal forecasts but lose skill quickly at multi-year horizons.

The multimodal ML framework in our recent study combines:

  • Structured data: Historical storm tracks, intensities, damages, sea surface temperature anomalies, wind shear, ENSO phase data (NOAA HURDAT2, IBTrACS, NOAA ERSST).
  • Unstructured text: Wikipedia "Geography" sections for coastal areas, encoding terrain, bay inlets, vegetation cover, and historical storm narratives.
  • Geospatial context: 1°×1° gridded global mapping for consistent spatial modeling.
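
To make the 1°×1° gridding concrete, here is a minimal sketch that maps a latitude/longitude pair to a grid cell. The function name `grid_cell` and the indexing convention (row 0 at 90°S, column 0 at 180°W) are assumptions for illustration, not details taken from the study.

```python
import math
from typing import Tuple

def grid_cell(lat: float, lon: float, res: float = 1.0) -> Tuple[int, int]:
    """Map a point to (row, col) on a global lat/lon grid.

    Hypothetical convention: row 0 starts at 90°S, col 0 starts at 180°W.
    """
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude must be in [-90, 90]")
    n_rows = int(round(180.0 / res))
    n_cols = int(round(360.0 / res))
    # Wrap longitude into [-180, 180) so 180°E and 180°W share a column.
    lon = (lon + 180.0) % 360.0 - 180.0
    # Clamp the North Pole into the top row instead of creating an extra one.
    row = min(int(math.floor((lat + 90.0) / res)), n_rows - 1)
    col = int(math.floor((lon + 180.0) / res)) % n_cols
    return row, col

# Miami, FL sits at roughly 25.76°N, 80.19°W.
print(grid_cell(25.76, -80.19))
```

Every storm record, environmental reading, and text embedding that falls in the same cell can then be keyed by the same `(row, col)` pair.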

How It Works: From Raw Data to Risk Maps

The pipeline from the new paper is:

1. Data preprocessing

  • Clean and align historical storm databases.
  • Extract environmental indicators (SST, wind shear, humidity, MSLP anomalies).
  • Gather relevant free-text geography descriptions from Wikipedia.

2. Feature engineering

  • Encode structured climate + storm history into numerical features.
  • Use DistilBERT to convert geographic text into embeddings.

3. Fusion modeling

  • Concatenate numerical + text embeddings.
  • Train an XGBoost model for classification (probability of major hurricane in next 1–5 years).

4. Evaluation

  • ROC AUC improvements of 20–25% over single-modality baselines.
  • Best performance for 1–2 year horizons; signal degrades but remains above baseline even at 5 years.
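
The fusion step above comes down to building one feature vector per grid cell. The sketch below shows that concatenation in plain Python. The feature names, the toy values, and the 4-dimensional stand-in for a DistilBERT embedding are all assumptions for illustration; the study's actual feature set is not listed in this article.

```python
from typing import Dict, List, Sequence

# Hypothetical feature order, chosen only for this example.
NUMERIC_FEATURES = ["sst_anomaly", "wind_shear", "enso_index", "storms_last_10y"]

def fuse(numeric: Dict[str, float], text_embedding: Sequence[float]) -> List[float]:
    """Early fusion: numeric features in a fixed order, then the text embedding."""
    missing = [name for name in NUMERIC_FEATURES if name not in numeric]
    if missing:
        raise KeyError(f"missing numeric features: {missing}")
    numeric_part = [float(numeric[name]) for name in NUMERIC_FEATURES]
    text_part = [float(x) for x in text_embedding]
    return numeric_part + text_part

# Toy grid cell. A real DistilBERT embedding has 768 dimensions, not 4.
cell_numeric = {"sst_anomaly": 0.4, "wind_shear": 8.2, "enso_index": -0.5, "storms_last_10y": 3}
cell_text = [0.12, -0.03, 0.88, 0.05]

fused = fuse(cell_numeric, cell_text)
print(len(fused), fused)
```

Each fused row, paired with a yes/no label for "major hurricane within the horizon", would then be handed to a gradient-boosted classifier such as XGBoost. That training step is left out here to keep the example dependency-free.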

(Boussioux, L., Zeng, C., Guénais, T., & Bertsimas, D. (2022). Hurricane Forecasting: A Novel Multimodal Machine Learning Framework. Weather and Forecasting, 37(6), 817-831. https://doi.org/10.1175/WAF-D-21-0091.1)

We slice time into 8 frames — each frame is a snapshot of the storm's environment (winds, pressure, humidity, etc.) from reanalysis maps.

A Convolutional Neural Network (CNN) looks at each frame like a pair of eyes scanning a photo, turning it into a list of important numbers — a one-dimensional embedding.

We add the storm's diary — at each time step, we attach statistical features like current wind speed, central pressure, and ocean heat.

The Transformer takes over — it's like the AI's memory, connecting what happened in frame 1 to frame 8, spotting patterns in how storms evolve.

We average ("pool") all frames to get the AI's overall sense of the storm's future.

One last fully connected layer turns that understanding into predictions — either:

  • Track: Where will the storm be in 24 hours?
  • Intensity: How strong will it be?
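
The flow described above (per-frame embedding, attach stats, attention across frames, pooling, final layer) can be traced in a toy forward pass. This is a sketch under heavy assumptions, not the published architecture: the CNN is replaced by made-up 3-number embeddings, the attention layer has a single head with no learned projections, and the weights are untrained placeholders.

```python
import math
from typing import List

Vector = List[float]

def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))

def softmax(xs: Vector) -> Vector:
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens: List[Vector]) -> List[Vector]:
    """Single-head scaled dot-product attention with identity Q/K/V projections."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in tokens])
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out

def mean_pool(tokens: List[Vector]) -> Vector:
    """Average all frames into one vector."""
    n = len(tokens)
    return [sum(t[j] for t in tokens) / n for j in range(len(tokens[0]))]

def linear(x: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """Fully connected layer: one output per weight row."""
    return [dot(row, x) + b for row, b in zip(weights, bias)]

def forward(frame_embeddings: List[Vector], frame_stats: List[Vector],
            weights: List[Vector], bias: Vector) -> Vector:
    # Attach the per-frame statistics to each frame embedding.
    tokens = [emb + stats for emb, stats in zip(frame_embeddings, frame_stats)]
    attended = self_attention(tokens)  # let every frame look at every other frame
    pooled = mean_pool(attended)       # one summary vector for the whole sequence
    return linear(pooled, weights, bias)

# 8 frames: a 3-number stand-in for each CNN embedding plus 2 stats per frame.
frames = [[0.1 * t, 0.2, -0.1 * t] for t in range(8)]
stats = [[0.5 + 0.05 * t, 1.0 - 0.02 * t] for t in range(8)]
W = [[0.1] * 5, [-0.2] * 5]  # untrained placeholder weights, 2 outputs
b = [0.0, 0.0]

prediction = forward(frames, stats, W, b)
print(prediction)
```

With two outputs, the head could stand for a track displacement (for example, change in latitude and longitude); a single output would fit an intensity target.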

Friendly Analogy

Imagine you're watching home videos of a child learning to walk. You don't just look at the last frame — you watch the whole sequence, remember the wobbles, the pauses, and the bursts of speed. By the end, you can guess pretty well when they'll take their next step.

Our AI does the same thing for storms — watching how they change over 8 "frames" and using both the visuals (maps) and the diary entries (stats) to guess where they'll be and how strong they'll get.

Real-World Tech: NASA's Hurricane Intensity Estimator

Short-term nowcasting can inform long-term models. NASA's IMPACT Hurricane Intensity Estimator uses GOES satellite infrared imagery to estimate wind speeds every 5–15 minutes.

In Hurricane Ida (2021), it nailed the peak wind speed at 129 knots vs. NHC's 130 — a rare match. While designed for real-time monitoring, its data outputs (storm intensification curves) could enrich the historical datasets used in multi-year forecasting.

What We Found: Better Together

From the new paper + prior work:

Years Ahead | Statistical-Only ROC AUC | Multimodal ROC AUC | Δ
1 Year      | 0.55                     | 0.76               | +0.21
2 Years     | 0.54                     | 0.75               | +0.21
5 Years     | 0.53                     | 0.74               | +0.21
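
For readers new to the metric: ROC AUC is the probability that the model scores a randomly chosen "hurricane" case higher than a randomly chosen "no hurricane" case, so 0.5 is a coin flip and 1.0 is perfect ranking. A minimal sketch of that definition, with made-up labels and scores, followed by a check of the Δ column using the values reported in the table:

```python
from typing import Sequence

def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up example: 1 = major hurricane occurred, 0 = it did not.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.3, 0.6, 0.7, 0.2, 0.8]
auc = roc_auc(labels, scores)
print(round(auc, 3))

# Values copied from the table: horizon -> (statistical only, multimodal).
table = {1: (0.55, 0.76), 2: (0.54, 0.75), 5: (0.53, 0.74)}
deltas = {years: round(multi - stat, 2) for years, (stat, multi) in table.items()}
print(deltas)
```

The pairwise loop is quadratic in the number of samples, which is fine for a demonstration but too slow for large evaluation sets.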

Why This Is a Big Deal

  • Urban planning: Identify surge-prone corridors years ahead.
  • Insurance & reinsurance: Calibrate catastrophe models with more varied signals.
  • Sovereign risk: Anticipate debt stress from future coastal disasters.

💡 Concept You Should Understand

Climate Oscillation Memory: Multi-year hurricane risk is shaped by lagged signals like El Niño, La Niña, and the Atlantic Multidecadal Oscillation. Including time-lagged environmental variables in the model helps it "remember" climate cycles.
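
One simple way to give a model that kind of memory is to add lagged copies of a climate index as extra features. The sketch below does this for an annual series. The function name `lagged_features`, the choice of lags, and the index values are all made up for illustration.

```python
from typing import Dict, List, Optional, Sequence

def lagged_features(series: Sequence[float],
                    lags: Sequence[int]) -> List[Dict[str, Optional[float]]]:
    """For each time step t, collect series[t - lag] for every lag.

    Entries are None where the series does not reach back far enough.
    """
    if any(lag < 0 for lag in lags):
        raise ValueError("lags must be non-negative")
    rows = []
    for t in range(len(series)):
        rows.append({f"lag_{lag}": (series[t - lag] if t - lag >= 0 else None)
                     for lag in lags})
    return rows

# Made-up annual ENSO-like index values, oldest first.
enso_index = [0.5, -0.8, 1.2, 0.1, -0.3]
rows = lagged_features(enso_index, lags=[1, 2])
for year_offset, row in enumerate(rows):
    print(year_offset, row)
```

Rows with `None` entries would normally be dropped or imputed before training.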

Cool Stuff & Take Action

  • Explore: NOAA's HURDAT2 dataset and NASA's IMPACT portal.
  • Build: Join NASA's Space Apps Challenge or Climate Change AI projects.
  • Think Bigger: Erin's track is 7 days out — imagine mapping likely storm corridors for 2030.

"We can't calm the storm, but we can teach ourselves to see it coming, and use every warning as a chance to save lives, protect homes, and safeguard our future."

– MML Lab Members
Hurricane Prediction, Machine Learning, Climate Change