How SoilStack predicts last-spring-frost dates
Plain-language documentation of the data, methods, and citations behind every frost outlook on a SoilStack zone page. Written so a curious gardener can follow it and a researcher can verify it from the same prose.
01 What this is
Every SoilStack zone page shows a section called NOAA Frost Probability with three dates: an earliest, a typical, and a latest. Those three dates come from a NOAA dataset called the 1991–2020 Annual/Seasonal Climate Normals — a 30-year statistical baseline that climatologists update once a decade. We refer to it as the Normals below.
This page documents exactly how those three dates are derived, what each one means, what the underlying data covers, and where it doesn't. It also documents the one subtle place where NOAA's labeling convention runs opposite to most gardening literature — a detail we surface here so anyone digging into the source data understands why the numbers we display might look like they're in reverse order.
The frost dates themselves are NOAA's. SoilStack's contribution is the presentation layer: aggregating thousands of weather stations into per-USDA-zone composites, translating the probability percentiles into plain-English headlines, and surfacing the citation chain so any AI engine, journalist, or researcher can verify our numbers against the primary source in two clicks.
What “last spring frost” means here. Throughout
this page (and on every SoilStack zone page) “last spring frost”
refers specifically to the last calendar date the minimum temperature
reaches 32°F or below — the freezing point of
pure water and the temperature NOAA uses for its primary frost
climatology element (TMIN-PRBLST-T32F). This is a hard
scientific threshold, not a biological one: some crops can be damaged
at temperatures above 32°F depending on duration of exposure,
humidity, wind, and the crop's individual cold sensitivity. These
dates are the freezing-threshold benchmark, not a universal
“safe to plant” signal. NOAA also publishes probability
dates at five other thresholds (16°F, 20°F, 24°F,
28°F, and 36°F) for stations and applications that need
different cutoffs — see Section 2 for the full element catalog.
02 Data sources
Primary: NOAA NCEI 1991–2020 Annual/Seasonal Climate Normals
Published in 2021 by NOAA's National Centers for Environmental Information.
Dataset identifier gov.noaa.ncdc:C01619. Covers approximately
15,000 U.S. weather stations and reports last-spring-frost (and first-fall-frost)
probability dates at six temperature thresholds — 16°F, 20°F,
24°F, 28°F, 32°F, and 36°F — for nine probability
percentiles each. Released under the
NOAA Open Data Dissemination license
(public domain). Update cycle: every ten years.
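As a rough illustration of the size of that catalog, the sketch below enumerates the last-spring-frost element keys for six thresholds and nine percentiles. The key pattern is extrapolated from the 32°F keys quoted later on this page (for example ANN-TMIN-PRBLST-T32FP10); applying the same pattern to the other five thresholds is an assumption, and the function name is hypothetical.

```python
from itertools import product

# Thresholds and percentiles as described in the text above.
THRESHOLDS_F = (16, 20, 24, 28, 32, 36)
PERCENTILES = (10, 20, 30, 40, 50, 60, 70, 80, 90)


def last_spring_frost_keys():
    """Build element keys of the form ANN-TMIN-PRBLST-T{threshold}FP{percentile}.

    Assumption: every threshold follows the naming pattern shown on this
    page for the 32°F element.
    """
    return [
        f"ANN-TMIN-PRBLST-T{threshold}FP{percentile}"
        for threshold, percentile in product(THRESHOLDS_F, PERCENTILES)
    ]


keys = last_spring_frost_keys()
print(len(keys))  # 6 thresholds x 9 percentiles
print(keys[:3])
```

Not every station reports every element, so a key existing in this list does not guarantee a value for a given station.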
We pulled this dataset via NCEI's Access Data Service in May 2026. Across the high-priority subnetworks (USC, USW, USS — Cooperative Observer, Weather Service first-order, and SNOTEL stations), 6,877 stations returned usable frost data on the first ingestion pass.
Supporting: US Census Bureau 2024 ZCTA Gazetteer
To map weather stations to USDA hardiness zones we needed ZIP-code centroids. Those come from the U.S. Census Bureau's 2024 ZIP Code Tabulation Area (ZCTA) Gazetteer — the canonical public-domain ZIP-to-coordinate reference. 33,791 ZCTA centroids were used to build a ZIP-to-station crosswalk via great-circle (Haversine) distance with inverse-distance-squared weighting. Median nearest-station distance across all U.S. ZIPs ended up at 11.85 km.
Supporting: USDA 2023 Plant Hardiness Zone Map
The Plant Hardiness Zone Map released by USDA in 2023 defines which
5-digit ZIP code maps to which hardiness zone (e.g. 47403 →
Zone 6b). We use that mapping to aggregate per-ZIP frost composites
into per-zone composites.
Coming in V2: NWS Gridpoint Forecast
A V2 of the Frost Prediction Engine will blend the climatological baseline above with the National Weather Service's 7-day Gridpoint Forecast. Methodology for that blending will land here when V2 ships — see Section 5's Coming next: live forecast blending subsection for the planned approach.
03 How NOAA calculates frost-date probabilities
In plain terms, here is the procedure NOAA applies to every station in the network:
- Take 30 years of daily minimum-temperature observations from a single weather station — for the current product, calendar years 1991 through 2020.
- For each of those 30 years, find the last day the station's daily minimum temperature reached or fell below a chosen threshold. For the 32°F threshold — the consumer-canonical definition of "frost" — that's the last spring day each year that frost was recorded.
- That gives you a set of 30 calendar dates — one last-frost date per year. Sort them.
- Compute the date by which 10% of those 30 years had seen their last frost. That's the 10th percentile. Do the same for 20%, 30%, all the way up to 90%. That's the full last-frost probability curve for that station.
Repeat across approximately 15,000 stations and you have the dataset SoilStack draws from. The full methodology, with all the statistical caveats around station completeness, missing-data interpolation, and quality-control flagging, is documented in Palecki et al. (2021) — see Section 9.
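The four steps above can be sketched in a few lines. This is a minimal illustration using a simple empirical quantile (the k-th earliest date, where k is the smallest count covering the requested share of years). It is not NOAA's exact estimator, which handles missing data and quality control as described in Palecki et al. (2021). The input below is synthetic, and the function name is hypothetical.

```python
def last_frost_percentiles(days_of_year, percentiles=(10, 20, 30, 40, 50, 60, 70, 80, 90)):
    """Return {percentile: day-of-year by which that share of years had seen last frost}.

    days_of_year: one last-frost day-of-year per year of record.
    """
    ordered = sorted(days_of_year)
    n = len(ordered)
    result = {}
    for p in percentiles:
        # Smallest number of years k with k / n >= p / 100 (integer ceiling division).
        k = (p * n + 99) // 100
        result[p] = ordered[k - 1]
    return result


# Synthetic 30-year record: days 100..129 in shuffled order (not real station data).
synthetic_days = [100 + (i * 7) % 30 for i in range(30)]
curve = last_frost_percentiles(synthetic_days)
print(curve[10], curve[50], curve[90])  # 102 114 126
```

The percentiles here run in the "frost is done by" direction used in the steps above; Section 5 explains how NCEI's published labels run the other way.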
04 What P10, P50, and P90 mean for planting decisions
Most gardening references give a single "last frost date" per location. That's a useful headline number — it's the median, the date by which half of years see their final frost — but it conceals the actual year-to-year variability. The P10/P50/P90 framing surfaces that variability directly.
On a SoilStack zone page, we label the three percentiles in gardener-friendly English:
- Earliest: the date by which roughly 10% of years have already seen their last frost. Planting tender crops this early will succeed in about 1 in 10 years and lose to a late frost in the other 9. Aggressive gardeners with row cover and a willingness to replant target this date.
- Typical: the median. Half of years have their last frost on or before this date. This is the date most one-number "last frost" references give, and it is a reasonable target for unprotected tender crops in an average year.
- Latest: the date by which roughly 90% of years have seen their last frost. Waiting until this date cuts the chance of a later frost to roughly 1 year in 10, at the cost of a shorter growing season. Conservative gardeners in cold climates target this date.
Zone 6b's composite shows Earliest Apr 9, Typical Apr 23, Latest May 10. A gardener planting tomatoes outdoors on Apr 9 is playing the 10% odds; that plant survives the frost in roughly 1 in 10 years. Planting on Apr 23 wins in about half of years. Waiting until May 10 reduces frost risk to roughly 10% of years — not zero. The roughly month-long window separating Earliest from Latest is exactly the "frost window" the chart on the zone page visualizes.
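For dates between the three published points, one rough way to read the curve is to interpolate linearly. The sketch below does that for the Zone 6b composite. Linear interpolation is our illustrative assumption, not something NOAA or the zone page computes, and the year 2025 is arbitrary (the composites carry no year). Outside the Earliest-to-Latest window the function simply returns the nearest known bound.

```python
from datetime import date


def share_of_years_done(planting, earliest, typical, latest):
    """Estimate the share of years whose last frost falls on or before `planting`.

    Interpolates linearly between (earliest, 0.10), (typical, 0.50), (latest, 0.90).
    Before `earliest` it returns 0.10 and after `latest` it returns 0.90,
    because the three points say nothing about the tails.
    """
    knots = [(earliest, 0.10), (typical, 0.50), (latest, 0.90)]
    if planting <= earliest:
        return 0.10
    if planting >= latest:
        return 0.90
    for (d0, p0), (d1, p1) in zip(knots, knots[1:]):
        if d0 <= planting <= d1:
            fraction = (planting - d0).days / (d1 - d0).days
            return p0 + fraction * (p1 - p0)


# Zone 6b composite from this page; the year is a placeholder.
zone_6b = (date(2025, 4, 9), date(2025, 4, 23), date(2025, 5, 10))
may_1 = share_of_years_done(date(2025, 5, 1), *zone_6b)
print(f"Share of years frost-free after May 1: {may_1:.2f}")  # about 0.69
```

Read this way, planting on May 1 in Zone 6b still meets a later frost in roughly 3 years out of 10.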
05 The percentile convention (read this if dates look backwards)
NOAA and the gardening literature use the same probability framework but run their percentile labels in opposite directions. Both are correct; both describe the same physical phenomenon. This section explains the mismatch so anyone comparing a SoilStack zone page to a NOAA data file sees why the labels flip.
Convention 1 — "probability frost still to come" (NOAA NCEI)
NCEI's dataset is labeled in terms of probability that frost is still to come. Under this convention:
- P10 means “10% chance frost is still to come after this date.” That's the latest plausible last-frost date.
- P50 means “50% chance frost is still to come.” The median.
- P90 means “90% chance frost is still to come.” That's the earliest plausible last-frost date.
Convention 2 — "probability frost is done by" (gardener-facing)
Gardening references, university extension publications, and most consumer-facing frost calculators label things in the opposite direction — probability that frost is already done:
- P10 means “10% chance frost is already done by this date.” The earliest plausible last-frost date.
- P50 means “50% chance frost is already done.” Same median.
- P90 means “90% chance frost is already done by this date.” The latest plausible last-frost date.
The two conventions describe the same physical reality — the distribution of last-frost dates over a 30-year period — from complementary angles. The median (P50) is the same in both. The two endpoints swap labels.
As a worked example, NCEI publishes the following three values for last-spring-frost at 32°F for station USC00040741, a high-elevation California station:
- ann-tmin-prblst-t32fp10: 06/09 (June 9). NCEI's Convention-1 P10 means "10% chance frost is still to come after June 9," so this is the latest plausible last-frost date.
- ann-tmin-prblst-t32fp50: 05/27 (May 27). The median.
- ann-tmin-prblst-t32fp90: 05/06 (May 6). NCEI's Convention-1 P90 means "90% chance frost is still to come after May 6," so this is the earliest plausible last-frost date.
On a SoilStack page, the same three numbers display as Earliest May 6, Typical May 27, Latest June 9 — Convention 2, gardener-friendly. Same data; same dates; opposite labels.
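The relabeling amounts to replacing each NCEI percentile p with 100 − p. A minimal sketch using the three example values above follows; the function and variable names are hypothetical and are not taken from SoilStack's codebase.

```python
def ncei_to_gardener(ncei_values):
    """Convert {NCEI percentile: date} (chance frost is still to come)
    into {gardener percentile: date} (chance frost is already done)."""
    return {100 - percentile: value for percentile, value in ncei_values.items()}


# Example values quoted above, keyed by NCEI (Convention 1) percentile.
ncei = {10: "06/09", 50: "05/27", 90: "05/06"}

gardener = ncei_to_gardener(ncei)
labels = {10: "Earliest", 50: "Typical", 90: "Latest"}
display = {labels[p]: gardener[p] for p in (10, 50, 90)}
print(display)  # {'Earliest': '05/06', 'Typical': '05/27', 'Latest': '06/09'}
```

The dates themselves never change; only the key attached to each date does, which is why the median stays put.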
You can verify the NCEI values directly with this command:
curl "https://www.ncei.noaa.gov/access/services/data/v1?dataset=normals-annualseasonal-1991-2020&stations=USC00040741&format=csv&dataTypes=ANN-TMIN-PRBLST-T32FP10,ANN-TMIN-PRBLST-T32FP50,ANN-TMIN-PRBLST-T32FP90"
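For scripted checks, the same request URL can be assembled in Python. This sketch only builds the URL (it makes no network call), using the endpoint and parameters from the curl command above; the helper name is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.ncei.noaa.gov/access/services/data/v1"


def build_normals_url(station_id, data_types):
    """Build an Access Data Service URL for one station and a list of element keys."""
    params = {
        "dataset": "normals-annualseasonal-1991-2020",
        "stations": station_id,
        "format": "csv",
        "dataTypes": ",".join(data_types),
    }
    # safe="," keeps the comma-separated element list unescaped, matching the curl command.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


url = build_normals_url(
    "USC00040741",
    ["ANN-TMIN-PRBLST-T32FP10", "ANN-TMIN-PRBLST-T32FP50", "ANN-TMIN-PRBLST-T32FP90"],
)
print(url)
```

The resulting string can be passed to any HTTP client to fetch the CSV.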
SoilStack stores values in Convention 1 (matching the NCEI source attribution chain in our database) and displays them in Convention 2 (matching gardener expectations). The conversion happens in exactly one place in our code, the per-zone aggregation step, and is documented inline alongside the conversion itself. The result: anyone tracing our numbers back to NCEI sees byte-for-byte parity in the source, with a single transparent label swap at the display layer.
Coming next: live forecast blending (V2)
The methodology above describes V1 of the Frost Prediction Engine — pure climatological baseline. A V2 release is planned that blends this baseline with the National Weather Service's 7-day Gridpoint Forecast: as the calendar approaches your zone's frost window, the predicted curve will narrow if NWS shows no sub-freezing temperatures in the remaining window, and the page will say so explicitly. The math for that blending will be documented here in full once V2 ships. This section will then split into V1 baseline + V2 blending, with this paragraph replaced by the locked specification.
06 How we compute per-zone composites
NOAA publishes data per weather station. A USDA hardiness zone covers many ZIP codes, and each ZIP code is served by several nearby stations. To go from station-level data to zone-level data we apply three aggregation steps, in this order:
Step 1 — Station-to-ZIP crosswalk
For every U.S. ZIP code, find the nearest weather stations by great-circle (Haversine) distance to the ZIP centroid, and weight each station's contribution to that ZIP by inverse distance squared. A station 5 km away counts four times as much as one 10 km away. Stations beyond 150 km of a ZIP are not used.
The crosswalk runs once per data refresh and produces a per-ZIP lookup: for every U.S. ZIP, here are the top contributing stations with their weights. Median nearest-station distance across all covered ZIPs is 11.85 km; 95% of ZIPs are within 26 km of a usable station. 191 ZIPs (chiefly bush Alaska) have no usable station within 150 km and are excluded.
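A minimal sketch of the distance and weighting logic in Step 1 is shown below. The 150 km cutoff and the inverse-distance-squared rule come from the text above. The Earth radius constant, the small distance floor that avoids dividing by zero, and the station IDs are our assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (assumed constant for this sketch)
MAX_DISTANCE_KM = 150.0      # stations farther than this are not used


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def station_weights(distances_km, max_km=MAX_DISTANCE_KM, min_km=0.1):
    """Return normalized inverse-distance-squared weights for stations within max_km.

    min_km is an assumed floor so a station sitting on the centroid
    does not produce a division by zero.
    """
    raw = {
        station: 1.0 / max(distance, min_km) ** 2
        for station, distance in distances_km.items()
        if distance <= max_km
    }
    total = sum(raw.values())
    return {station: weight / total for station, weight in raw.items()} if raw else {}


# Hypothetical stations at 5 km, 10 km, and 200 km from a ZIP centroid.
weights = station_weights({"STATION_A": 5.0, "STATION_B": 10.0, "STATION_C": 200.0})
print(weights)  # STATION_A carries four times the weight of STATION_B; STATION_C is dropped
```

An empty result corresponds to the ZIPs described above that have no usable station within 150 km.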
Step 2 — Per-ZIP composite
For each ZIP, combine its contributing stations into a single composite frost prediction. Frost dates are aggregated using a weighted circular mean — the standard technique for averaging calendar dates, which handles year-boundary wraparound correctly (relevant for warm zones where last frost falls in January or February of the same calendar year). Numeric values like growing-season length are aggregated with a weighted arithmetic mean.
If a contributing station happens to be missing a specific measurement (some stations record temperature thresholds others don't), the remaining stations' weights are renormalized for that element rather than backfilled with neighbors. The result: honest nulls when no coverage exists; no quiet interpolation.
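The weighted circular mean and the renormalization rule in Step 2 can be sketched as follows. The idea is to map each day-of-year to an angle on a 365-day circle, average the unit vectors, and map the result back to a day. The function name, the rounding to whole days, and the sample inputs are assumptions for illustration.

```python
import math

DAYS_IN_YEAR = 365


def weighted_circular_mean_doy(values, weights):
    """Weighted circular mean of day-of-year values (1..365).

    values:  {station: day-of-year, or None when the station lacks the element}
    weights: {station: weight}
    Stations with a missing value are skipped and the remaining weights are
    renormalized. Returns None when no station has a value.
    """
    pairs = [(values[s], w) for s, w in weights.items() if values.get(s) is not None]
    if not pairs:
        return None  # honest null: nothing to average
    total = sum(w for _, w in pairs)
    x = sum(w / total * math.cos(2 * math.pi * d / DAYS_IN_YEAR) for d, w in pairs)
    y = sum(w / total * math.sin(2 * math.pi * d / DAYS_IN_YEAR) for d, w in pairs)
    if math.hypot(x, y) < 1e-12:
        return None  # vectors cancel out; the mean direction is undefined
    angle = math.atan2(y, x) % (2 * math.pi)
    day = round(angle / (2 * math.pi) * DAYS_IN_YEAR)
    return day or DAYS_IN_YEAR  # day 0 and day 365 are the same point on the circle


# Year-boundary case: Dec 26 (day 360) and Jan 5 (day 5) average to Dec 31, not early July.
print(weighted_circular_mean_doy({"A": 360, "B": 5}, {"A": 0.5, "B": 0.5}))  # 365

# Missing value: station B is skipped and A and C are renormalized to 5/7 and 2/7.
print(weighted_circular_mean_doy({"A": 100, "B": None, "C": 110}, {"A": 0.5, "B": 0.3, "C": 0.2}))  # 103
```

A plain arithmetic mean of days 360 and 5 would land near day 182, which is the failure the circular mean avoids.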
Step 3 — Per-zone composite
For each USDA hardiness zone, average the per-ZIP composites across every ZIP assigned to that zone in the USDA 2023 map. This is an equal-weight average across ZIPs — ZIP-level weighting already accounted for station distance in step 1, so the zone-level step doesn't re-weight by station count.
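Step 3 can be sketched as an equal-weight average over a zone's ZIPs. Two assumptions are made here: the zone-level average is taken on the same 365-day circle as the ZIP-level step, and a ZIP counts as "contributing" when its composite is not null. The ZIP values below are invented for illustration.

```python
import math

DAYS_IN_YEAR = 365


def zone_composite(zip_doys):
    """Equal-weight circular mean across a zone's per-ZIP day-of-year composites.

    zip_doys: {zip_code: day-of-year, or None when the ZIP has no composite}
    Returns (mean day-of-year or None, contributing ZIP count, total ZIP count).
    """
    usable = [d for d in zip_doys.values() if d is not None]
    total = len(zip_doys)
    if not usable:
        return None, 0, total
    angles = [2 * math.pi * d / DAYS_IN_YEAR for d in usable]
    x = sum(math.cos(a) for a in angles) / len(angles)
    y = sum(math.sin(a) for a in angles) / len(angles)
    if math.hypot(x, y) < 1e-12:
        return None, len(usable), total  # mean direction undefined
    day = round((math.atan2(y, x) % (2 * math.pi)) / (2 * math.pi) * DAYS_IN_YEAR)
    return (day or DAYS_IN_YEAR), len(usable), total


# Invented per-ZIP values; the third ZIP has no usable composite.
result = zone_composite({"47401": 112, "47403": 114, "47404": None})
print(result)  # (113, 2, 3)
```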
Reference values per zone (current build)
The 10 USDA hardiness zones currently covered, with their composite last-spring-frost percentiles displayed in gardener convention — Earliest = gardener P10 (10% of years have already had last frost), Typical = P50 (median), Latest = gardener P90 (90% of years have already had last frost). These are the same values appearing on each zone's page. Click any zone code to see the full chart.
| Zone | Earliest | Typical | Latest | Contributing ZIPs | Total ZIPs |
|---|---|---|---|---|---|
| 5A | Apr 23 | May 7 | May 21 | 2,257 | 2,393 |
| 5B | Apr 20 | May 4 | May 18 | 2,580 | 2,865 |
| 6A | Apr 15 | Apr 29 | May 15 | 4,570 | 5,183 |
| 6B | Apr 9 | Apr 23 | May 10 | 4,562 | 5,263 |
| 7A | Apr 2 | Apr 17 | May 4 | 4,006 | 4,856 |
| 7B | Mar 22 | Apr 7 | Apr 23 | 3,034 | 3,840 |
| 8A | Mar 11 | Mar 30 | Apr 15 | 2,596 | 3,421 |
| 8B | Feb 27 | Mar 20 | Apr 8 | 2,648 | 3,375 |
| 9A | Feb 5 | Mar 4 | Mar 27 | 1,767 | 2,199 |
| 9B | Jan 6 | Feb 7 | Mar 9 | 1,474 | 2,085 |
Warm zones (9A and 9B) show last-frost dates in January, February, and early March of the same calendar year — those climates see their final frost in winter, not spring. The aggregation handles this year-wrap explicitly via a 365-day circular calendar internally; the dates above are correct.
07 Limitations
What this dataset does well, what it doesn't, and where it should be cross-checked with local knowledge:
It's a 30-year baseline, not a single-year forecast
The dates on this page describe the distribution of last-frost dates across 30 years (1991–2020). They are not a prediction of when last frost will occur in any one calendar year. For a specific year's forecast you need a short-range weather prediction; the V2 blending described in Section 5 is the intended next layer.
Microclimate variation is not captured at the zone level
If your garden sits at the bottom of a frost pocket, against a south-facing brick wall, or 800 feet up a hillside above your town, your actual last-frost date can be one to three weeks off the zone-wide composite. The dataset is calibrated to weather stations sited per WMO observation standards, which is roughly "open, level, away from heat sources" — not the typical backyard. The composite is a good starting estimate; your own yard's record over multiple seasons is the ultimate authority for your site.
Station density varies by region
Eastern U.S. zones have dense station coverage. Bush Alaska, high-elevation western U.S., and parts of the desert Southwest have substantially sparser coverage. The "Contributing ZIPs" column in the reference table above tells you how much ZIP-level coverage each zone composite has. Read it against "Total ZIPs": the smaller the contributing share, the fewer mapped ZIPs the composite rests on and the more cautiously it should be interpreted.
Data revision cadence is 10 years
NCEI updates the Climate Normals once per decade. The current product is the 1991–2020 release; the next refresh covering 2001–2030 will arrive around 2031. In a warming climate, a baseline that ends in 2020 means our last-frost dates may run slightly later than present-day conditions would produce, a known and intentional conservatism in NCEI's product design.
No NOAA endorsement
NOAA publishes the source data and does not endorse SoilStack's presentation of it. The dataset is in the public domain under NOAA's Open Data Dissemination license; SoilStack's aggregation and presentation are our own work.
08 Data freshness
Several timestamps relate to a SoilStack frost page; here's what each one means.
- The underlying NCEI dataset covers 1991–2020 and was published in 2021. It is refreshed by NOAA once per decade.
- Our ingestion timestamp — when we pulled the dataset into our database — is May 2026 for this V1 release.
- The per-zone "Composite generated" line on each zone page reflects when we last recomputed the zone composite from the ingested station data. That changes only when the aggregation logic itself changes or when station data is re-ingested.
- This methodology page's "Updated" timestamp — the stamp at the top — reflects when the methodology document last changed. That's separate from the data refresh cycle. See Section 11 for the methodology version history.
Because the underlying NCEI dataset only refreshes once per decade, SoilStack frost pages do not regenerate daily. The data is stable climatology, not a daily forecast. When V2 lands and we begin blending in the live NWS forecast, the per-zone pages will gain a second, faster-moving timestamp; this methodology page will document the dual-cadence model when that happens.
09 Citation chain
The primary methodology paper for the NCEI 1991–2020 Normals product is Palecki et al. (2021). The methodology rests on a broader literature documenting how NCEI calculates climate normals; we list the load-bearing references below.
Primary methodology
- Palecki, M., Durre, I., Applequist, S., Arguez, A., & Lawrimore, J. (2021). U.S. Climate Normals 2020: Methodology of Temperature-Related Normals. NOAA National Centers for Environmental Information. ncei.noaa.gov/products/land-based-station/us-climate-normals
Supporting normals methodology
- Applequist, S., Arguez, A., Durre, I., Squires, M., Vose, R., & Yin, X. (2012). 1981–2010 U.S. hourly climate normals. Bulletin of the American Meteorological Society, 93(11), 1637–1640.
- Arguez, A., Durre, I., Applequist, S., Vose, R. S., Squires, M. F., Yin, X., Heim, R. R., & Owen, T. W. (2012). NOAA's 1981–2010 U.S. climate normals: An overview. Bulletin of the American Meteorological Society, 93(11), 1687–1697.
- Durre, I., Squires, M. F., Vose, R. S., Yin, X., Arguez, A., & Applequist, S. (2013). NOAA's 1981–2010 U.S. climate normals: Monthly precipitation, snowfall, and snow depth. Journal of Applied Meteorology and Climatology, 52(11), 2377–2395.
- World Meteorological Organization (2017). WMO Guidelines on the Calculation of Climate Normals (No. 1203).
Climate-trend context
- McCabe, G. J., Betancourt, J. L., Pederson, G. T., & Schwartz, M. D. (2015). Variability common to first leaf dates and snowpack in the western conterminous United States. Earth Interactions, 19(15), 1–15.
- Kukal, M. S., & Irmak, S. (2018). U.S. agro-climate in 20th century: Growing degree days, first and last frost, growing season length, and impacts on crop yields. Scientific Reports, 8(1), 6977.
10 For developers and researchers
Every per-zone frost page emits a schema.org/Dataset JSON-LD entity with the full citation chain machine-readable. The entity ID pattern is:
https://soilstack.net/zone/{zone}#frost-dataset
where {zone} is one of 5a through 9b.
Each Dataset declares isBasedOn → the NCEI source
dataset, citation → the Palecki et al. (2021)
methodology paper, license → the NOAA Open Data
Dissemination URL, and isPartOf → this methodology
page (https://soilstack.net/frost/methodology#article),
along with three PropertyValue entries for the
P10/P50/P90 dates with the NCEI element keys documented.
This page emits a corresponding
schema.org/TechArticle
entity that lists the Dataset entities via mentions.
An AI engine or knowledge-graph crawler traversing either node
lands in a complete, self-consistent subnet for SoilStack's
NOAA-cited frost data.
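To make the shape of that entity concrete, here is a rough sketch that assembles a Dataset node as a Python dictionary and serializes it. The property names isBasedOn, citation, license, and isPartOf come from the description above. Holding the PropertyValue entries under variableMeasured, the exact field layout, and the license placeholder are assumptions, so the markup actually emitted on a zone page may differ.

```python
import json

METHODOLOGY_ID = "https://soilstack.net/frost/methodology#article"


def frost_dataset_jsonld(zone, earliest, typical, latest):
    """Assemble an illustrative schema.org/Dataset node for one zone."""

    def percentile(name, value, ncei_key):
        return {"@type": "PropertyValue", "name": name, "value": value, "propertyID": ncei_key}

    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "@id": f"https://soilstack.net/zone/{zone}#frost-dataset",
        "isBasedOn": {"@type": "Dataset", "identifier": "gov.noaa.ncdc:C01619"},
        "citation": "Palecki et al. (2021), U.S. Climate Normals 2020",
        "license": "<NOAA Open Data Dissemination URL>",  # placeholder, not a real URL
        "isPartOf": {"@id": METHODOLOGY_ID},
        # Assumption: PropertyValue entries live under variableMeasured.
        # Gardener labels map to NCEI keys in reverse order (see Section 5).
        "variableMeasured": [
            percentile("Earliest", earliest, "ANN-TMIN-PRBLST-T32FP90"),
            percentile("Typical", typical, "ANN-TMIN-PRBLST-T32FP50"),
            percentile("Latest", latest, "ANN-TMIN-PRBLST-T32FP10"),
        ],
    }


doc = frost_dataset_jsonld("6b", "Apr 9", "Apr 23", "May 10")
print(json.dumps(doc, indent=2))
```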
Citing SoilStack
If you cite a SoilStack zone-page frost prediction in writing, please attribute the underlying data to NOAA NCEI and link this methodology page so readers can verify the chain. A workable short form:
"Last-spring-frost composite per
SoilStack's frost methodology
(derived from NOAA NCEI 1991–2020 Climate Normals,
gov.noaa.ncdc:C01619, Palecki et al. 2021)."
Or you can cite NCEI directly for the underlying data and skip us — the methodology paper handles the rigor on its own.
11 Version history
This page is versioned. When the methodology itself materially changes, the version number bumps and a row is added below. Wording-only edits don't bump the version.
- v1.0 —
- Initial methodology release. Documents the Frost Prediction Engine V1 build: NOAA NCEI 1991–2020 Climate Normals as the sole data source, station-to-ZIP crosswalk via inverse-distance-squared Haversine weighting, per-ZIP composites via weighted circular mean, per-zone composites via equal-weight average. Documents the NCEI percentile convention bridge (Section 5).
Planned future versions: v2.0 will land with the V2 NWS short-range forecast blending (Section 5 will then split into V1 baseline + V2 blending). v2.1 will land with autumn first-frost extension (same methodology applied to autumn variables in the same NCEI dataset).