How Our HR Picks Work
Every ranking on this site comes from a deterministic probability model. No vibes, no hot takes, no "locks of the day." This page explains exactly what goes into the picks — and, just as importantly, what the model can't do.
The One-Sentence Summary
For every batter-pitcher-park matchup today, we estimate the probability that the batter hits at least one home run in the game. We rank everyone by that probability, and the top of the list becomes "today's picks."
The Inputs
Five categories of data feed the model, in rough order of importance:
- Batter HR rate (season + last 15 games). Season rate gets 70% weight and recent form gets 30%. Players with fewer than 50 PA fall back to league average until they reach that threshold.
- Pitcher HR/9 allowed. How often this pitcher gives up home runs, normalized per 9 innings. Pitchers with fewer than 15 IP fall back to league average.
- Statcast quality (barrel %). Barrels are batted balls hit with both ideal exit velocity and launch angle — they're the strongest single predictor of future home run production. We use both the batter's barrel rate and the pitcher's barrel rate allowed.
- Handedness splits. Batter OPS vs. the pitcher's hand, and pitcher wOBA allowed vs. the batter's stance. Only applied when sample sizes are reliable (40+ PA).
- Park factor. Each MLB park has a HR factor for LHB and RHB separately. 100 is neutral, Coors Field is ~120, Petco is ~90. Where available, we also factor in temperature and wind.
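As a rough illustration of the first and last inputs above, here is a minimal Python sketch of the 70/30 blend with the 50 PA fallback, plus a park factor conversion. The function names are hypothetical, and two steps are assumptions rather than anything stated on this page: that a rate becomes a multiplier by dividing by league average, and that a park factor becomes a multiplier by dividing by 100.

```python
LEAGUE_AVG_HR_PER_PA = 0.032  # ~3.2% league average quoted on this page


def blended_batter_rate(season_hr, season_pa, recent_hr, recent_pa,
                        league_avg=LEAGUE_AVG_HR_PER_PA, min_pa=50):
    """Blend season and last-15-games HR rates at 70/30.

    Falls back to league average below min_pa season plate appearances.
    """
    if season_pa < min_pa:
        return league_avg
    season_rate = season_hr / season_pa
    # Assumption: with no recent plate appearances, reuse the season rate.
    recent_rate = recent_hr / recent_pa if recent_pa > 0 else season_rate
    return 0.7 * season_rate + 0.3 * recent_rate


def park_multiplier(park_factor):
    """Assumed conversion: a park HR factor of 100 maps to a 1.0 multiplier."""
    return park_factor / 100.0


# Hypothetical batter: 20 HR in 400 season PA, 3 HR in 60 recent PA.
rate = blended_batter_rate(season_hr=20, season_pa=400, recent_hr=3, recent_pa=60)
# Assumed conversion: multiplier = blended rate relative to league average.
batter_rate_mult = rate / LEAGUE_AVG_HR_PER_PA
```

In this example both rates are 5%, so the blend is 5%, or about 1.56 times the 3.2% league average.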
The Math (Short Version)
We start with the league average HR rate per plate appearance (~3.2%), then apply a series of multipliers:
expected_HR_per_PA =
    league_avg × batter_rate_mult × pitcher_HR9_mult
    × √(batter_barrel_mult × pitcher_barrel_mult)
    × handedness_mult × park_mult × weather_mult
P(at least 1 HR in game) = 1 − (1 − expected_HR_per_PA) ^ expected_PA
The square root on the barrel adjustment dampens extremes: a batter with 2x league-average barrel rate doesn't mean 2x HRs, because not every barrel clears the fence. The final probability is capped between 2% and 30%, because in real MLB history, no matchup has ever truly exceeded those bounds.
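The formula above can be sketched in Python as follows. This is an illustration under stated assumptions, not the production code: the function name and argument names are hypothetical, the default of 4.0 expected PA is the flat estimate mentioned later on this page, and the guard that keeps the per-PA rate between 0 and 1 is an added safety step.

```python
import math

LEAGUE_AVG_HR_PER_PA = 0.032  # ~3.2% league average quoted on this page


def game_hr_probability(batter_rate_mult, pitcher_hr9_mult,
                        batter_barrel_mult, pitcher_barrel_mult,
                        handedness_mult=1.0, park_mult=1.0, weather_mult=1.0,
                        expected_pa=4.0, floor=0.02, cap=0.30):
    """Return P(at least one HR in the game), clamped to [floor, cap]."""
    # Geometric mean of the two barrel multipliers dampens extremes.
    barrel_mult = math.sqrt(batter_barrel_mult * pitcher_barrel_mult)
    hr_per_pa = (LEAGUE_AVG_HR_PER_PA * batter_rate_mult * pitcher_hr9_mult
                 * barrel_mult * handedness_mult * park_mult * weather_mult)
    # Assumption (not from the page): keep the per-PA rate a valid probability.
    hr_per_pa = min(max(hr_per_pa, 0.0), 1.0)
    # Chance of at least one HR across expected_pa independent plate appearances.
    p_game = 1.0 - (1.0 - hr_per_pa) ** expected_pa
    return min(max(p_game, floor), cap)


# A fully neutral matchup: every multiplier is 1.0.
neutral = game_hr_probability(1.0, 1.0, 1.0, 1.0)
```

A neutral matchup comes out at roughly 12%, while doubling every batter and pitcher multiplier pushes the raw value well past the 30% cap, which is where the clamp takes over.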
What "Confidence" Means
Each pick is tagged High, Medium, or Low confidence. This is about data quality, not certainty of the outcome:
- High: Large samples on both sides, consistent signals across factors.
- Medium: One small sample or one contradictory factor.
- Low: Multiple small samples or conflicting signals. Treat these as flyers.
Even a high-confidence pick with a 15% HR probability will miss 85% of the time. That's not the model being wrong — that's how rare home runs are.
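One way the tiers could be assigned is sketched below. The page does not spell out the exact rule, so treat this as an assumption: it counts small samples using the thresholds from the Inputs section (50 PA, 15 IP, 40 PA for splits), adds a caller-supplied count of contradictory factors, and maps the total to a tier. The function name and the `conflicting_factors` argument are hypothetical.

```python
def confidence_tier(batter_pa, pitcher_ip, split_pa, conflicting_factors=0):
    """Map data-quality issues to a tier: 0 -> High, 1 -> Medium, 2+ -> Low."""
    # Each comparison is True (counts as 1) when that sample is too small.
    small_samples = sum([
        batter_pa < 50,   # batter falls back to league average
        pitcher_ip < 15,  # pitcher falls back to league average
        split_pa < 40,    # handedness split not applied
    ])
    # Assumption: small samples and contradictory factors weigh equally.
    issues = small_samples + conflicting_factors
    if issues == 0:
        return "High"
    if issues == 1:
        return "Medium"
    return "Low"
```

The tier says nothing about the size of the probability itself: a 6% pick can be High confidence and a 20% pick can be Low.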
What the Model Does Not Do
Transparency matters here. The current model has known limitations:
- No pitch-level data. We use season aggregates, not pitch-type matchups. A batter who crushes sliders facing a slider-heavy pitcher isn't identified.
- No lineup position. Leadoff vs. 9-hole changes expected PA, but we use a flat 4.0 estimate. Adding lineup position is on the roadmap.
- No bullpen factors. If a starter gets pulled in the 4th and the reliever is elite, the model doesn't adjust.
- No injury adjustments. A batter playing through a wrist injury looks identical to one in peak form. This will always be a limitation without proprietary info.
- BvP history is not used. We looked at whether head-to-head batter-vs-pitcher history adds predictive value. In samples under ~25 PA (which is nearly all of them), it doesn't. Including it made predictions worse, not better. So we leave it out.
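To get a feel for how much the flat 4.0 PA estimate in the list above can matter, the sketch below recomputes the game probability for a few expected PA values. The 4% per-PA rate and the PA figures for each lineup slot are hypothetical numbers chosen for illustration, not model outputs.

```python
def game_probability(hr_per_pa, expected_pa):
    """P(at least one HR) over expected_pa independent plate appearances."""
    return 1.0 - (1.0 - hr_per_pa) ** expected_pa


HR_PER_PA = 0.04  # hypothetical above-average hitter

# Hypothetical expected PA by lineup slot; the model currently uses 4.0 for all.
for label, pa in [("9-hole", 3.8), ("flat estimate", 4.0), ("leadoff", 4.6)]:
    print(f"{label}: {game_probability(HR_PER_PA, pa):.3f}")
```

At a 4% per-PA rate the flat estimate gives about 15%, and the same hitter's number moves up or down with every extra or lost trip to the plate.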
How We Know It Works (Backtesting)
Every change to the model is validated by running it against historical games before shipping. We measure three things:
- Calibration. When the model says "10% HR probability," the actual rate for those players should be ~10%. If it's 5% or 15%, the model is broken.
- Ranking lift. Our top 5 picks per day should hit HRs at well above the league average rate. Current target: 2x+ baseline. Synthetic testing shows 1.84x; real-data testing is ongoing.
- Brier score. A standard measure of probabilistic forecast accuracy. Lower is better. We publish it.
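The three measurements above can be sketched in a few lines of Python. This is a minimal illustration, not the backtest harness itself: the function names are hypothetical, the bucket count of 20 (five-point buckets) is an assumption, and `probs` and `outcomes` stand for predicted probabilities and 0/1 results for the same player-games.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def calibration_table(probs, outcomes, n_buckets=20):
    """Group predictions into equal-width buckets.

    Returns (mean predicted probability, actual HR rate, count) per bucket.
    """
    buckets = {}
    for p, y in zip(probs, outcomes):
        key = min(int(p * n_buckets), n_buckets - 1)
        buckets.setdefault(key, []).append((p, y))
    rows = []
    for key in sorted(buckets):
        pairs = buckets[key]
        n = len(pairs)
        mean_pred = sum(p for p, _ in pairs) / n
        hit_rate = sum(y for _, y in pairs) / n
        rows.append((mean_pred, hit_rate, n))
    return rows


def ranking_lift(top_pick_outcomes, all_outcomes):
    """HR rate among top picks divided by the HR rate across all player-games."""
    top_rate = sum(top_pick_outcomes) / len(top_pick_outcomes)
    base_rate = sum(all_outcomes) / len(all_outcomes)
    return top_rate / base_rate
```

A well-calibrated model shows the first two columns of each calibration row close together once the bucket count is large; buckets with only a handful of games will look off by chance alone.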
Our current calibration and backtest results are published on the calibration page, updated weekly. When the model misses, you'll see it.
The Honest Limits of Any HR Model
Home runs are rare events. Even the hottest hitter facing the most homer-prone pitcher in a bandbox with the wind blowing out tops out at roughly a 20-25% chance of homering in a game. That means:
- The best picks miss 75-80% of the time.
- A single-day sample tells you almost nothing.
- Meaningful accuracy only shows up over hundreds of predictions.
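The arithmetic behind the three points above is short enough to sketch. The helper names are hypothetical, and both functions assume independent picks that all share the same hit probability, which real slates only approximate.

```python
import math


def prob_all_miss(p_hit, n_picks):
    """Chance that every one of n_picks independent picks misses."""
    return (1.0 - p_hit) ** n_picks


def hit_rate_standard_error(p_hit, n_picks):
    """Standard error of the observed hit rate over n_picks predictions."""
    return math.sqrt(p_hit * (1.0 - p_hit) / n_picks)


one_day = prob_all_miss(0.20, 5)                # five picks at 20% each
se_day = hit_rate_standard_error(0.20, 5)       # noise in one day's hit rate
se_season = hit_rate_standard_error(0.20, 500)  # noise over 500 predictions
```

With five picks at 20% each, all five miss about a third of the time. The observed hit rate carries a standard error of about 18 percentage points after one day and under 2 points after 500 predictions, which is why single days tell you so little.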
If a site promises "guaranteed picks" or shows win rates above 40% on individual HR props, it is cherry-picking, lying outright, or both. We'd rather tell you the math is imperfect and show you the receipts.
What We're Working On
- Incorporating bullpen HR vulnerability past the 5th inning.
- Lineup position → expected PA adjustments.
- Pitch-type matchup data from Baseball Savant.
- Better recent-form models that distinguish real trends from noise.
- Similar-pitcher analogues for small-sample starters (rookies, call-ups).
Have a question or criticism, or think you've spotted a bug? We want to hear about it. Email [feedback address] or open an issue on our public methodology repo.