Introduction

In 2017, while on the faculty at Skidmore College, Michael Lopez (now Senior Director of Analytics at the NFL) wrote a paper with coauthors Greg Matthews and Ben Baumer that used gambling odds to determine how often the better team wins across the four major North American sports. As part of that work, they estimated an individual home field advantage for every team in those four leagues. Overall, the paper estimated that the NFL has the second largest home field advantage of the four, after the NBA.

Their Bayesian model ranked the home field advantage of each team and found Denver (altitude, anyone?) to have the highest posterior mean in each sport. For the NFL, however, the 95% posterior intervals for all the teams overlapped, so the data do not clearly show that some teams have tougher home fields than others.

I wrote on my blog in 2024 about estimating differences in college home fields by fitting an expected points added (EPA) based power ranking with an individual home field effect for each team instead of a single overall effect. That work found that the typical “toughest” places to play, as ranked by EA Sports’ College Football video game franchise, didn’t line up with the model. Most of the toughest places to play in the video game were simply the home fields of programs that are usually very good, not especially difficult crowds or stadiums compared to any other college football stadium.

Bayesian power ranking methods such as mine or the one in the Lopez, Baumer, & Matthews 2017 paper can and will always rank each team’s home field advantage, but with noisy game results they have a hard time producing posterior distributions of individual home field advantages that don’t overlap substantially.

Also in 2024, I looked at college basketball to see if something similar was possible. As with my college football work and the aforementioned “How often does the best team win?” paper, the number of teams whose 95% posterior interval excluded the average home field advantage was right in line with what you would expect from multiple testing alone. But… I was able to show that one specific aspect of home field advantage (free throws) was affected in stadiums where the opposing student section sits directly behind the visiting team’s basket in the 2nd half.

Home Field Advantage by Year

The easiest way to estimate home field advantage by year is to build a model that accounts for the teams in each game, the scoring efficiencies of each team, and the location of the game. That model can then be used to estimate the home field advantage for each year.

I will use expected points added (EPA) per play as the efficiency metric for each team in each game. I will use a Bayesian model to estimate the home field advantage for each year. I will ignore all QB kneels, spikes, and kickoffs in the model, as these plays are not indicative of a team’s efficiency. The EPA model used is the one included with nflfastR, which is built with XGBoost. There may be better models out there, especially in edge situations, but this model is still very good and the easiest to access.

I will fit a mixed model. In this form it acts like ridge regression, shrinking estimates toward 0, which helps when estimating season-by-season home field advantage and team effects. I will also put a spline on the difference in rest between the two teams, as I have found in the past that the effect of rest is not linear. The model includes separate team effects for offense and defense in each season.
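To make the ridge-like shrinkage concrete, here is a minimal base R sketch. It is not the model fitted in this post. It assumes a simple random-intercept setup in which the residual variance sigma2 and the between-team variance tau2 are known (both values below are made up), so that a team's estimate is its raw mean multiplied by n / (n + sigma2 / tau2).

```r
# Hypothetical values: one team-season's EPA per 70 plays in each game.
team_games <- c(6.1, -2.3, 9.8, 4.0, 1.4, 7.5, -0.9, 5.2)

sigma2 <- 100  # assumed residual (game-to-game) variance
tau2   <- 9    # assumed between-team variance

n        <- length(team_games)
raw_mean <- mean(team_games)

# Shrinkage weight: close to 1 with many games, close to 0 with few.
weight      <- n / (n + sigma2 / tau2)
shrunk_mean <- weight * raw_mean

c(raw = raw_mean, shrunk = shrunk_mean, weight = weight)
```

The fewer games a team has, or the noisier the games are relative to the spread between teams, the harder the estimate is pulled toward 0.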

# A tibble: 640 × 5
   team  season    off    def    tot
   <chr> <chr>   <dbl>  <dbl>  <dbl>
 1 ARI   2005    -1.25 -0.885  -2.13
 2 ARI   2006    -2.76 -2.67   -5.43
 3 ARI   2007    -3.32 -0.480  -3.80
 4 ARI   2008     1.38  0.531   1.91
 5 ARI   2009     3.67 -0.644   3.03
 6 ARI   2010   -10.4  -1.98  -12.3 
 7 ARI   2011    -8.55  0.232  -8.32
 8 ARI   2012    -9.34  2.64   -6.71
 9 ARI   2013     1.20  4.46    5.66
10 ARI   2014    -2.03  3.20    1.17
# ℹ 630 more rows

Here are the model’s estimated home effects by season, expressed per 70 offensive plays in a game. Remember that this is the effect of playing at home versus a neutral site, so home versus road is double this effect for the offensive plays, and 4 times this effect once both the offensive and defensive plays in a game are counted.

Season	EPA vs Neutral 70 Plays	EPA vs Road per Game
2005	1.46	5.84
2006	0.07	0.30
2007	1.00	4.01
2008	1.01	4.04
2009	0.77	3.08
2010	−0.15	−0.60
2011	1.89	7.57
2012	0.41	1.63
2013	1.11	4.44
2014	1.20	4.81
2015	0.39	1.56
2016	1.74	6.97
2017	1.39	5.56
2018	1.13	4.53
2019	−0.96	−3.83
2020	−1.05	−4.21
2021	0.51	2.05
2022	0.96	3.84
2023	1.44	5.75
2024	0.10	0.41

Each NFL season is a pretty small sample, so it shouldn’t be a surprise that the estimates fluctuate the way they do, but overall we can see that there is a home field advantage over the long run.
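As a quick check on the conversion and the long-run claim, the sketch below re-enters the rounded season estimates from the table above and applies the factor of 4. The variable names are my own, and because the inputs are rounded to two decimals the products can differ slightly from the table's last column (for example 2006).

```r
# Per-season "home vs neutral" estimates (EPA per 70 offensive plays),
# copied from the table above for 2005 through 2024.
hfa_vs_neutral <- c(1.46, 0.07, 1.00, 1.01, 0.77, -0.15, 1.89, 0.41, 1.11, 1.20,
                    0.39, 1.74, 1.39, 1.13, -0.96, -1.05, 0.51, 0.96, 1.44, 0.10)
names(hfa_vs_neutral) <- 2005:2024

# Home vs road on offense is 2x; adding the defensive plays doubles it again.
hfa_vs_road_game <- 4 * hfa_vs_neutral

mean_vs_neutral <- mean(hfa_vs_neutral)    # long-run average vs neutral
mean_vs_road    <- mean(hfa_vs_road_game)  # long-run average vs road, per game
n_negative      <- sum(hfa_vs_neutral < 0) # seasons with a negative estimate

round(c(mean_vs_neutral = mean_vs_neutral, mean_vs_road = mean_vs_road), 2)
n_negative
```

With these rounded inputs the average is roughly 0.72 EPA per 70 plays versus neutral, or about 2.9 EPA per game versus the road, and three of the twenty seasons (2010, 2019, 2020) have a negative estimate.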

So there is a home field advantage, but it fluctuates from year to year. There are various theories about what causes it; one is that the away team gets rattled or that the referees call more penalties on it. Let’s dive in.

Causes of Home Field Advantage

Now I want to look at the different play types in the NFL data, with a focus on penalties. I will first tabulate the play types to familiarize myself with them before narrowing in on penalties.

Play Type	Play Type (NFL)	n
extra_point	XP_KICK	25266
field_goal	FIELD_GOAL	20772
field_goal	UNSPECIFIED	1
kickoff	FREE_KICK	1
kickoff	KICK_OFF	54603
kickoff	RUSH	2
no_play	FREE_KICK	22
no_play	KICK_OFF	198
no_play	PENALTY	46896
no_play	TIMEOUT	38466
no_play	UNSPECIFIED	2578
no_play	XP_KICK	102
pass	FUMBLE_RECOVERED_BY_OPPONENT	391
pass	INTERCEPTION	9744
pass	PASS	356662
pass	PAT2	1316
pass	SACK	25052
pass	UNSPECIFIED	199
punt	PUNT	48408
punt	UNSPECIFIED	42
qb_kneel	RUSH	8127
qb_kneel	UNSPECIFIED	1
qb_spike	PASS	1544
qb_spike	UNSPECIFIED	1
run	PASS	3
run	PAT2	493
run	RUSH	283682
run	UNSPECIFIED	351
NA	COMMENT	728
NA	END_GAME	5412
NA	END_QUARTER	16556
NA	GAME_START	5412
NA	KICK_OFF	1
NA	TIMEOUT	1
NA	UNSPECIFIED	179

The only explicit label for a penalty is in the play_type_nfl column. First I need to check that the data include not only accepted penalties but also penalties that were called and then declined.

# A tibble: 12 × 2
   play_type_nfl                    n
   <chr>                        <int>
 1 FIELD_GOAL                      72
 2 FREE_KICK                       23
 3 FUMBLE_RECOVERED_BY_OPPONENT    12
 4 INTERCEPTION                   509
 5 KICK_OFF                      2770
 6 PASS                          4815
 7 PENALTY                      45973
 8 PUNT                          4821
 9 RUSH                          4635
10 SACK                           332
11 UNSPECIFIED                   2593
12 XP_KICK                        510

I want to know which penalty-related fields already exist in the dataset.

first_down_penalty	penalty	penalty_team	penalty_player_id	penalty_player_name	penalty_yards	penalty_type
0	1	ARI	NA	NA	5	Illegal Shift
0	1	ARI	00-0020533	L.Davis	5	False Start
0	1	NYG	00-0019276	S.O'Hara	10	Offensive Holding
0	1	NYG	00-0021265	N.Greisen	10	Offensive Holding
0	1	ARI	00-0023510	E.Green	5	Offside on Free Kick
1	1	ARI	00-0022927	K.Dansby	6	Illegal Use of Hands
0	1	NYG	00-0019276	S.O'Hara	5	False Start
0	1	ARI	00-0023443	A.Rolle	5	Offside on Free Kick
0	1	ARI	00-0012333	C.Okeafor	5	Defensive Offside
1	1	NYG	00-0020006	A.Pierce	2	Defensive Pass Interference

A quick exploratory analysis is to look at the average penalties and penalty yards per game for each team at home versus away in a season.

Rank	season	posteam	Avg Penalties (Away)	Avg Penalties (Home)	Avg Penalty Yards (Away)	Avg Penalty Yards (Home)	Home vs Away Penalties	Home vs Away Penalty Yards
1	2024	CAR	5.6	8.0	45.1	61.2	2.4	16.1
2	2024	KC	5.4	5.8	40.8	55.8	0.4	15.0
3	2024	NYJ	6.4	6.8	50.1	62.5	0.4	12.4
4	2024	DET	4.6	4.9	32.9	44.0	0.3	11.1
5	2024	NYG	7.0	7.6	48.9	59.6	0.6	10.7
6	2024	LAC	4.8	6.0	39.1	49.6	1.2	10.5
7	2024	DEN	4.6	5.9	33.0	42.8	1.3	9.8
8	2024	WAS	5.1	7.3	45.8	54.8	2.2	9.0
9	2024	ATL	6.0	6.3	48.2	56.6	0.3	8.3
10	2024	NE	6.6	7.2	49.4	57.4	0.6	8.0
11	2024	SEA	6.9	8.1	54.1	62.1	1.2	8.0
12	2024	LA	4.8	5.7	43.8	50.2	0.9	6.4
13	2024	HOU	7.1	6.3	49.2	54.1	−0.8	4.9
14	2024	LV	5.7	5.8	44.0	42.5	0.1	−1.5
15	2024	CLE	7.1	7.9	62.9	59.5	0.8	−3.4
16	2024	GB	6.6	6.2	57.1	52.3	−0.4	−4.8
17	2024	PHI	5.6	6.4	47.4	42.5	0.7	−4.8
18	2024	NO	7.0	5.7	58.5	52.4	−1.3	−6.1
19	2024	SF	7.1	5.8	50.5	44.3	−1.3	−6.2
20	2024	MIN	7.2	6.5	63.5	57.2	−0.8	−6.2
21	2024	BAL	5.9	6.6	49.9	43.0	0.7	−6.9
22	2024	TEN	8.2	6.9	63.3	56.2	−1.3	−7.1
23	2024	DAL	8.4	7.7	65.6	58.0	−0.7	−7.6
24	2024	CHI	7.5	7.1	59.2	50.4	−0.4	−8.9
25	2024	JAX	5.2	4.9	44.6	35.7	−0.4	−8.9
26	2024	BUF	7.4	6.3	60.9	50.6	−1.1	−10.3
27	2024	ARI	7.4	5.3	57.0	46.7	−2.0	−10.3
28	2024	TB	6.2	6.3	61.2	50.3	0.0	−11.0
29	2024	PIT	6.8	6.4	63.4	51.5	−0.4	−11.9
30	2024	MIA	8.3	6.0	70.6	53.5	−2.3	−17.1
31	2024	CIN	7.2	5.9	57.8	40.6	−1.4	−17.2
32	2024	IND	5.3	5.0	60.2	37.2	−0.3	−23.0
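A table like the one above can be built from game-level penalty totals by averaging within team and site and then differencing. The sketch below shows that pattern in base R on a tiny made-up data frame; the team codes and yardage values are hypothetical and are not taken from the play-by-play data.

```r
# Hypothetical game-level penalty totals for two teams (not real data).
games <- data.frame(
  team          = c("AAA", "AAA", "AAA", "AAA", "BBB", "BBB", "BBB", "BBB"),
  site          = c("home", "home", "away", "away", "home", "home", "away", "away"),
  penalty_yards = c(60, 50, 40, 30, 35, 45, 70, 50)
)

# Average penalty yards per game by team and site.
avg_by_site <- aggregate(penalty_yards ~ team + site, data = games, FUN = mean)

# One row per team, with the home and away averages side by side.
wide <- reshape(avg_by_site, idvar = "team", timevar = "site", direction = "wide")

# Positive values mean more penalty yards at home than on the road.
wide$home_vs_away <- wide$penalty_yards.home - wide$penalty_yards.away
wide
```

A positive home_vs_away value corresponds to the top rows of the table, where a team was flagged for more yards at home.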

While over half the teams had more penalty yards on the road than at home in 2024, I was quite surprised that 13 teams had more penalty yards per game at home than on the road. Perhaps that is just variance from a sample of 8-9 games per team in each location per season, but averaged over the 20 years in this dataset, teams have 1.6 more penalty yards and 0.12 more penalties per game at home than on the road. That goes against everything we would expect from home field advantage regarding penalties. Let’s look at this on a year by year basis:

To the naked eye the empirical trend looks like an inverted parabola, but we probably need to account for more than just the overall averages, which leads us to a model.

Now I want to see whether playing at home affects penalty yards per game, while accounting for a team’s typical penalty rate in a game, the NFL trend for that season, the opponent’s penalty yards given up per game, and perhaps other factors such as the predicted score, which I will derive from the betting spread and total.

So I need to take the penalty plays and aggregate them up to the game level, then add in various game specific information at the end such as the predicted spread and total.

Now I need to adjust game variables to be from the possessing team’s perspective instead of the home team’s perspective.
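As an illustration of that perspective flip, here is a small base R sketch on a made-up schedule. The column names mirror the schedule fields used in this post, but the rows are hypothetical, and the approach (one row per team per game, then an ifelse on whether the team is the home team) is only one way to do it.

```r
# Hypothetical schedule rows (not real games).
schedule <- data.frame(
  game_id   = c("g1", "g2"),
  home_team = c("AAA", "CCC"),
  away_team = c("BBB", "AAA"),
  home_rest = c(7, 10),
  away_rest = c(14, 7)
)

# Two rows per game: one from each team's point of view.
team_games <- rbind(
  data.frame(game_id = schedule$game_id, team = schedule$home_team),
  data.frame(game_id = schedule$game_id, team = schedule$away_team)
)
team_games <- merge(team_games, schedule, by = "game_id")

# Re-express the home/away columns relative to the row's team.
is_home <- team_games$team == team_games$home_team
team_games$site      <- ifelse(is_home, "home", "away")
team_games$team_rest <- ifelse(is_home, team_games$home_rest, team_games$away_rest)
team_games$opp_rest  <- ifelse(is_home, team_games$away_rest, team_games$home_rest)
team_games$team_rest_effect <- team_games$team_rest - team_games$opp_rest

team_games[, c("game_id", "team", "site", "team_rest_effect")]
```

Within a game the two rows carry equal and opposite rest differences, which is a handy sanity check after the flip.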

The last piece of information I want to add to my dataset is the number of plays in the game, excluding kneels and spikes, so we can normalize penalties by how many “opportunities” there were for a penalty to be called.
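The normalization itself is a one-line rescaling to a common 70-play base. A minimal sketch with made-up game totals:

```r
# Hypothetical game totals (not real data).
penalty_yards <- c(55, 80, 30)
plays         <- c(128, 150, 140)  # plays in each game, kneels and spikes excluded

# Penalty yards per 70 plays.
penalty_yards_per_70 <- penalty_yards / plays * 70
round(penalty_yards_per_70, 2)
```

Rescaling this way keeps a game with many snaps from looking more penalty-prone simply because there were more chances for a flag.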

Let’s look at the results.

Variable	Estimate	Standard Error	t-value	p-value
(Intercept)	21.71	1.57	13.82	0.00
posteam_sitehome	0.95	0.28	3.36	0.00
posteam_siteneutral	1.59	1.11	1.43	0.15
bs(team_rest_effect, df = 3)1	2.53	2.66	0.95	0.34
bs(team_rest_effect, df = 3)2	−1.39	1.70	−0.82	0.41
bs(team_rest_effect, df = 3)3	0.50	1.38	0.36	0.72
indoors	−0.13	0.34	−0.38	0.71
grass	−0.02	0.30	−0.07	0.95
team_pred_pt_diff	0.04	0.02	1.85	0.06
total_line	0.09	0.03	2.95	0.00

Interestingly enough, the model suggests that teams actually get about 0.95 more penalty yards per 70 plays at home than on the road after adjusting for these other factors. Since the indoors, team_rest_effect, and grass terms are nowhere near significant, I will remove them and refit the model.

Variable	Estimate	Standard Error	t-value	p-value
(Intercept)	22.25	1.26	17.65	0.00
posteam_sitehome	0.97	0.28	3.43	0.00
posteam_siteneutral	1.62	1.11	1.46	0.14
team_pred_pt_diff	0.04	0.02	1.90	0.06
total_line	0.08	0.03	2.93	0.00

The result is pretty much the same. Teams get about 0.97 more penalty yards per 70 plays at home than on the road after adjusting for the predicted spread and point total. This was certainly a surprise to me.
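For readers less used to factor coding, here is a toy base R example of why the posteam_sitehome coefficient reads as "home minus away". The data are made up and noise-free, and the site factor is given the level order away, home, neutral so that away is the reference level, which is what alphabetical ordering would produce as well.

```r
# Hypothetical penalty yards per 70 plays, two games per site (not real data).
toy <- data.frame(
  site = factor(c("away", "away", "home", "home", "neutral", "neutral"),
                levels = c("away", "home", "neutral")),
  penalty_yards_per_70 = c(20, 24, 23, 25, 22, 28)
)

fit <- lm(penalty_yards_per_70 ~ site, data = toy)

# (Intercept) is the away mean; sitehome and siteneutral are differences from it.
coef(fit)
```

With no other covariates, the intercept equals the away average and each site coefficient is that site's average minus the away average. In the models above the same reading applies after adjusting for the spread and total.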

Referees

One last thing that may be worth looking at is using the same model we just had but adding a random effect for the referee. Perhaps some referee crews call more penalties than others.

The referee model didn’t change the effect of the other variables much but it does provide us some estimates for referee effects.

Effect	Group	Variable	Estimate	Standard Error	t-value
fixed	NA	(Intercept)	22.46	1.30	17.27
fixed	NA	posteam_sitehome	0.97	0.28	3.47
fixed	NA	posteam_siteneutral	1.53	1.10	1.39
fixed	NA	team_pred_pt_diff	0.04	0.02	1.93
fixed	NA	total_line	0.08	0.03	2.76
ran_pars	referee	sd__(Intercept)	1.64	NA	NA
ran_pars	Residual	sd__Observation	13.55	NA	NA

Here are the 10 referees with the largest estimated increase in penalty yards per 70 plays called in their games.

Referee	Penalty Yards per 70 Plays	Standard Error
Jeff Triplette	2.24	0.63
Terry McAulay	1.65	0.62
Carl Cheffers	1.55	0.55
Tom White	1.55	1.36
Walt Anderson	1.52	0.59
Bradley Rogers	1.46	1.36
Bruce Hermansen	1.23	1.57
Ron Winter	1.22	0.73
Alex Kemp	1.21	0.79
Bernie Kukar	1.16	1.36

And here are the 10 referees with the largest estimated decrease in penalty yards per 70 plays called in their games.

Referee	Penalty Yards per 70 Plays	Standard Error
Bill Vinovich	−3.03	0.58
Alberto Riveron	−2.73	0.92
Gene Steratore	−2.44	0.65
Bill Carollo	−2.28	0.97
Scott Novak	−2.14	0.85
Scott Green	−2.08	0.74
Peter Morelli	−1.56	0.60
Alan Eck	−1.27	1.17
Don Carey	−1.22	1.37
Bill Leavy	−1.09	0.70

Clearly some referee crews call more or fewer penalty yards per 70 plays than others. The estimates range from -3.03 to 2.24 yards per 70 plays, a spread of 5.27 yards. Since a typical game is about 140 plays, the largest effect we estimate a referee can have on a game, on average, is a difference of 10.54 penalty yards between the most and least penalizing referees.
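The per-game figure is just the range of the two tables rescaled from 70 plays to a full game. A short sketch of that arithmetic, taking the 140-play game length from the text as an approximation:

```r
# Largest and smallest referee effects from the two tables above
# (penalty yards per 70 plays).
most_penalizing  <- 2.24   # Jeff Triplette
least_penalizing <- -3.03  # Bill Vinovich

spread_per_70 <- most_penalizing - least_penalizing

# Assumes roughly 140 plays in a game, as stated in the text.
plays_per_game  <- 140
spread_per_game <- spread_per_70 * plays_per_game / 70

c(per_70 = spread_per_70, per_game = spread_per_game)
```

Note that several referees in both tables have standard errors above 1, so the true gap between the extremes is itself uncertain.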

Do Kansas City opponents get called for more penalties?

Well that one is really impossible to determine in the data. Why?

Even if the Chiefs have more penalties called on their opponents than other teams do, it could be due to a variety of causes, such as:

The Chiefs are usually winning, so teams are more desperate, leading to more penalties.
Andy Reid has coached for decades so he knows how to avoid getting penalties called on his team.
Opponents lose their cool due to the pressure of playing the Chiefs dynasty.
… and yes, if their opponents did get more calls then it could theoretically be due to referee help.

We can’t assign a cause, but do we even know whether their opponents are penalized more? We can fit a model for that, predicting the opponent’s penalty yards based on

the current team,
the opponent,
the referee,
and a normalizing intercept for the season.

Now I fit the mixed effects regression, using only data since the 2019 season, the first of Patrick Mahomes’ Super Bowl seasons. I estimate random effects on penalty yards for each team and for each opponent, plus a random effect for the referee. I also adjust for the expected closeness of the game by including the betting spread and total.

Let’s look at the fixed effects:

# A tibble: 13 × 4
   term                   estimate std.error statistic
   <chr>                     <dbl>     <dbl>     <dbl>
 1 (Intercept)               24.6       2.85      8.63
 2 factor(season)2020        -5.01      0.84     -5.99
 3 factor(season)2021        -3.18      0.81     -3.91
 4 factor(season)2022        -5.52      0.82     -6.75
 5 factor(season)2023        -4.87      0.83     -5.89
 6 factor(season)2024        -1.95      0.82     -2.38
 7 spread_line               -0.45      0.43     -1.05
 8 total_line                 0.1       0.06      1.63
 9 spread_line:total_line     0.01      0.01      0.96
10 sd__(Intercept)            1.06     NA        NA   
11 sd__(Intercept)            1.39     NA        NA   
12 sd__(Intercept)            1.26     NA        NA   
13 sd__Observation           13.2      NA        NA

And here are the random effects for teams, sorted from the largest increase in opponent penalty yards per 70 plays to the smallest.

# A tibble: 33 × 3
   team  estimate std.error
   <chr>    <dbl>     <dbl>
 1 TEN      2.01      0.961
 2 OAK      1.72      1.28 
 3 NYJ      1.71      0.968
 4 DAL      1.65      0.959
 5 NO       1.39      0.962
 6 CLE      1.04      0.961
 7 BAL      0.742     0.952
 8 LV       0.586     1.01 
 9 DET      0.581     0.959
10 JAX      0.540     0.964
# ℹ 23 more rows

Takeaways

The NFL has a home field advantage, but the estimate fluctuates from year to year because each season is a small sample. The cause of that advantage has long been studied but is hard to pin down. One theory, that it comes from referee bias against the road team, isn’t supported by the penalty data here. In the models above, the average team gets about 0.97 more penalty yards per 70 plays at home than on the road after adjusting for the predicted spread and total.

We also looked at individual referees to see how much they affect the game. On average, the estimated difference between the most and least penalizing referees is only about 10.5 penalty yards per game.

--- title: "Home Field Advantage in the NFL" date: 2025-01-31 author: "Paul Sabin" description: "Has it shrunk? Is it caused by penalties?" categories: - NFL - football - home field advantage - sports analytics - EPA image: "images/nfl_logo.png" twitter-card: image: "images/nfl_logo.png" open-graph: image: "images/nfl_logo.png" format: html: code-fold: true code-summary: "Show the code" editor: visual execute: echo: false warning: false error: false message: false cache: false --- ```{r} library(tidyverse) # library(cfbfastR) library(lubridate) library(splines) library(rstanarm) library(broom.mixed) library(rstan) library(rstanarm) library(kableExtra) library(viridis) library(tidybayes) library(gtExtras) library(ggimage) library(nflfastR) # source("r/get_gamma_parameters.R") source("r/haversine_distance.R") source("r/get_utc_offset.R") options(tibble.width = Inf) options(mc.cores = parallel::detectCores()) options(scipen= 999) current_year <- 2024 historical_seasons <- 20 all_seasons <- (current_year - historical_seasons + 1):current_year #function to convert team abbreviations with name changes for PBP data source("r/convert_team_abbreviation.R") ``` ```{r} # nfl_participation <- nflreadr::load_participation(run_season) nfl_rosters <- nflreadr::load_rosters_weekly(all_seasons) # nfl_contracts <- nflreadr::load_contracts() nfl_schedule <- nflreadr::load_schedules() nfl_player_stats <- load_player_stats(all_seasons) nfl_pbp <- load_pbp(all_seasons) ## add a indicator for home vs neutral vs away team (1, 0, -1) nfl_pbp <- nfl_pbp |> mutate(posteam_site_ind = case_when(location == "Neutral" ~ 0, posteam == home_team ~ 1, posteam == away_team ~ -1, TRUE ~ NA_real_), posteam_site = case_when(location == "Neutral" ~ "neutral", posteam == home_team ~ "home", posteam == away_team ~ "away", TRUE ~ NA_character_) ) #adjust team names for pbp data when teams move or change names nfl_pbp <- convert_team_abbreviation(data = nfl_pbp, col_name = 'posteam', team_conversion) 
nfl_pbp <- convert_team_abbreviation(data = nfl_pbp, col_name = 'defteam', team_conversion) ``` # Introduction In 2017 as faculty at Skidmore College, current Senior Director of Analytics of the NFL Michael Lopez wrote a paper with his coauthors Greg Matthews and Ben Baumer where they used gambling odds to determine how often the better team ends up winning across the four major North American sports. As a result of this paper, they estimated the individual home field advantage for each of the teams in four major sports leagues in North America. Overall the paper estimated the NFL to have the second largest home field advantage (after the NBA) of the four major professional North American sports. While their Bayesian model ranked the home field advantage of each team, and found Denver (altitude anyone?) to have the highest posterior mean in each sport, for the NFL the 95% posterior intervals for all the teams overlapped, meaning the data was not indisputable that some teams have harder home fields than others. ![HFA Estimates from 'How Often Does the Best Team Win?'](images/best_team_win.png)I personally wrote on [my blog in 2024](https://sabinanalytics.com/blog/2024/07:cfb-hfa-comparison/) trying to estimate differences in college home field's by fitting an expected points added (EPA) based power ranking with individual home field effects for each team as opposed to one overall one. This work found that the elite teams or typical "toughest" places to play as ranked by EA sport's College Football video game franchise didn't line up with the model because most of those toughest places to play in the video game were just the home fields of typically very good programs, not especially difficult crowds or stadiums (compared to any other college football stadium). 
Such Bayesian power ranking based methods as mine or Lopez, Baumer, & Matthews 2017 paper can and will always *rank* a team's home field advantage, but have a hard time finding in noisy game results posterior distributions of individual home field advantages that don't overlap significantly. Also in 2024 I looked at [college basketball](https://sabinanalytics.com/blog/2024/02:28:do-college-crowds-affect-free-throws/) to see if you could do something similar. Like my college football work or the aforementioned "How often does the best team win?" paper, the number of teams with 95% posterior significance from an average home field advantage was right in line with what you would expect with multiple testing. *But...* I was able to show one specific aspect of home field advantage (free throws) were impacted in stadiums where the opposing student sections were directly behind the visiting team's basket in the 2nd half. ## Home Field Advantage by Year The easiest way to estimate home field advantage by year is to build a model which accounts for the teams in each game, the scoring efficiences of each team, and the location of the game. This model can then be used to estimate the home field advantage for each year. I will use expected points added (EPA) per play as the efficiency metric for each team in each game. I will use a Bayesian model to estimate the home field advantage for each year. I will ignore all QB kneels, spikes, and kickoffs in the model as these plays are not indicative of a team's efficiency. The EPA model used is the one included with NFLFastR which is built via XGBoost. There may be better models out there, especially in edge situations, but this model is still very good and the most easy to access. 
```{r} nfl_epa_game <- nfl_pbp |> filter(#!is.na(play_type), !play_type %in% c("qb_kneel", "qb_spike", "kickoff"), between(wp, 0.05, 0.95), !is.na(posteam), !is.na(epa), # special == 0 ) |> group_by(game_id, season, posteam, defteam, posteam_site_ind, posteam_site) |> summarize(epa_game = sum(epa), epa_play = mean(epa), plays = n()) |> #adjust each to be "vs average" in each season group_by(season) |> mutate(epa_play = epa_play - weighted.mean(epa_play, w = plays), epa_game = epa_game - weighted.mean(epa_game, w = plays), epa_per_70 = epa_play*70) |> ungroup() # add in rest and other factors like outdoors grass nfl_epa_game <- nfl_epa_game |> left_join(nfl_schedule |> dplyr::select(game_id, home_team, away_team, roof, surface, referee, stadium_id, stadium, home_rest, away_rest), by = c("game_id") ) |> mutate(team_rest = ifelse(posteam == home_team, home_rest, away_rest), opp_rest = ifelse(posteam == home_team, away_rest, home_rest), team_rest_effect = team_rest - opp_rest, indoors = ifelse(roof == "dome" | roof == "closed", 1, 0), grass = ifelse(str_detect(surface, "grass"), 1, 0), ) ``` I will fit mixed model. In this format it acts like ridge regression, shrinking estimates to 0, which will be good for estimating season by season home field advantage and team effects. I will also put a spline on the difference for team rest, as I have found in the past that the effect of rest is not linear. Be sure to include separate team effects for offense and defense for each season. 
```{r} #so the bayesian model runs in parallel options(mc.cores = parallel::detectCores()) # nfl_epa_game <- nfl_epa_game |> mutate(team_season = paste0(posteam, "_", season), defteam_season = paste0(defteam, "_", season) ) library(lme4) epa_game_model <- lmer(epa_per_70 ~ bs(team_rest_effect, df = 3) + indoors + grass + (1 | team_season) + (1 | defteam_season) + factor(season) + factor(season):posteam_site_ind, data = nfl_epa_game) tidy(epa_game_model, effects = 'ran_vals') |> filter(str_detect(group, "team_season|defteam_season")) |> mutate(side = case_when(group == "team_season" ~ "off", group == "defteam_season" ~ "def", TRUE ~ NA_character_)) |> filter(!is.na(side)) |> mutate(team = str_extract(level, ".+_") |> str_remove("_"), season = str_extract(level, "[[:digit:]]{4}"), estimate = ifelse(side == "def", -estimate, estimate)) |> dplyr::select(team, season, side, estimate) |> pivot_wider(values_from = "estimate", names_from = "side") |> mutate(tot = off + def) ``` Here are the model's estimated effects for the home team for 70 plays on offense in a game by season. Remember this is the effect of playing at home vs neutral, so home vs road would be double this effect for the offensive plays, but 4 times it for home vs away for all plays in a game. ```{r} home_effects <- tidy(epa_game_model) |> filter(str_detect(term, "posteam_site")) |> mutate(season = str_extract(term, "\\d{4}"), home_vs_away_full_game = estimate*4) home_effects |> dplyr::select(season, estimate, home_vs_away_full_game) |> gt() |> cols_label( estimate = "EPA vs Neutral 70 Plays", season = "Season", home_vs_away_full_game = "EPA vs Road per Game" ) |> #round last 2 columns to 2 digits fmt_number( columns = where(is.numeric) & !matches("season"), decimals = 2 ) ``` The nfl is really a pretty small sample size each season, so there shouldn't be a surprise it fluctuates in estimates the way it does, but overall we can see that there is a home field advantage over the long run. 
```{r} avg_hfa <- home_effects |> summarize(avg_hfa = mean(estimate)) home_effects |> ggplot(aes(x = as.numeric(season), y = home_vs_away_full_game)) + geom_line() + geom_hline(data = avg_hfa, aes(yintercept = avg_hfa), linetype = 2, col = 'black') + geom_hline(yintercept = 0, linetype = 2, color = 'red') + theme_minimal() + labs(title = "Home Field Advantage by Year", subtitle = "2005-2024", y = "EPA/game", x = "Season") ``` So there is a home field, but it does fluctuate year to year. There are various theories as to what causes it but one such one is that the away team gets rattled or the ref calls more penalties on them. Let's dive in. ## Causes of Home Field Advantage  Now I want to look at the different play types in the NFL. I am going to focus on penalties. I will look at the different types of plays to familiarize myself with them in order to focus penalties. ```{r} nfl_pbp |> count(play_type, play_type_nfl) |> gt() |> cols_label( play_type = "Play Type", play_type_nfl = "Play Type (NFL)" ) ``` It is clear that the only denotation of a penalty is in the `play_type_nfl` column. First I need to make sure this isn't only penalties accepted but also penalties given (but declined). ```{r} nfl_pbp |> filter(!is.na(penalty), penalty == 1) |> count(play_type_nfl) ``` I know what to know what fields there are corresponding to penalties already in the dataset. ```{r} nfl_pbp |> filter(!is.na(penalty), penalty == 1) |> select(contains("penalty")) |> slice(1:10) |> gt() ``` A really quick exploratory data analysis would be to look and see the average penalties and penalty yards per game for a team at home vs away in a season in the NFL. 
```{r} avg_penalty_by_site <- nfl_pbp |> filter(!is.na(penalty), penalty == 1) |> group_by(season, posteam, posteam_site, game_id) |> summarize(penalties = sum(penalty), penalty_yards = sum(penalty_yards)) |> group_by(season, posteam, posteam_site) |> summarize(avg_penalties = mean(penalties), avg_penalty_yards = mean(penalty_yards), .groups = 'drop') |> pivot_wider(names_from = posteam_site, values_from = c(avg_penalties, avg_penalty_yards)) |> mutate(home_vs_away_penalties = avg_penalties_home - avg_penalties_away, home_vs_away_penalty_yards = avg_penalty_yards_home - avg_penalty_yards_away) avg_penalty_by_site |> filter(season == 2024) |> arrange(desc(home_vs_away_penalty_yards)) |> mutate(row_num = 1:n()) |> dplyr::select(-contains("neutral")) |> dplyr::select(row_num, everything()) |> gt() |> cols_align(align = "right", columns = everything()) |> #round to one decimal in all numeric columns fmt_number( columns = where(is.numeric) & !matches("season"), decimals = 1 ) |> cols_label( row_num = "", avg_penalties_home = "Avg Penalties (Home)", avg_penalties_away = "Avg Penalties (Away)", avg_penalty_yards_home = "Avg Penalty Yards (Home)", avg_penalty_yards_away = "Avg Penalty Yards (Away)", home_vs_away_penalties = "Home vs Away Penalties", home_vs_away_penalty_yards = "Home vs Away Penalty Yards" ) |> #color the negative values red and positive green # Apply conditional styling for home_vs_away_penalties tab_style( style = list( cell_text(color = "red") # Red text for negative values ), locations = cells_body( columns = home_vs_away_penalties, rows = home_vs_away_penalties < 0 # Apply to rows where Value1 is negative ) ) |> tab_style( style = list( cell_text(color = "blue") # Blue text for positive values ), locations = cells_body( columns = home_vs_away_penalties, rows = home_vs_away_penalties > 0 # Apply to rows where Value1 is positive ) ) |> # Apply conditional styling for home_vs_away_penalty_yards tab_style( style = list( cell_text(color = "red") # Red text 
for negative values ), locations = cells_body( columns = home_vs_away_penalty_yards, rows = home_vs_away_penalty_yards < 0 # Apply to rows where Value1 is negative ) ) |> tab_style( style = list( cell_text(color = "blue") # Blue text for positive values ), locations = cells_body( columns = home_vs_away_penalty_yards, rows = home_vs_away_penalty_yards > 0 # Apply to rows where Value1 is positive ) ) ``` While over half the teams have more penalty yards when on the road than at home in 2024, I was quite surprised there were 13 teams that had more penalty yards per game at home than on the road. Perhaps it is variance to the sample size of 8-9 games for a team in each location per season, but over the 20 year average of this dataset the teams on average have `r round(mean(avg_penalty_by_site$home_vs_away_penalty_yards),1)` more penalty yards per game and `r round(mean(avg_penalty_by_site$home_vs_away_penalties),2)` penalties per game. That goes against everything we would expect from home field advantage regarding penalties. 
Let's look at this on a year by year basis:

```{r}
nfl_ssn_avg_penalty_by_site <- avg_penalty_by_site |>
  group_by(season) |>
  summarize(avg_home_vs_away_penalty_yards = mean(home_vs_away_penalty_yards),
            avg_home_vs_away_penalties = mean(home_vs_away_penalties),
            .groups = 'drop')

# league-average home minus away penalty yards per game, by season
nfl_ssn_avg_penalty_by_site |>
  ggplot() +
  geom_line(aes(x = season, y = avg_home_vs_away_penalty_yards)) +
  geom_hline(yintercept = 0, linetype = 2, color = 'red') +
  theme_minimal() +
  labs(title = "Average Home vs Away Penalty Yards per Game",
       subtitle = "2005-2024",
       y = "Average Difference per Game",
       x = "Season")

# league-average home minus away penalty counts per game, by season
# (the y aesthetic previously reused the penalty yards column)
nfl_ssn_avg_penalty_by_site |>
  ggplot() +
  geom_line(aes(x = season, y = avg_home_vs_away_penalties)) +
  geom_hline(yintercept = 0, linetype = 2, color = 'red') +
  theme_minimal() +
  labs(title = "Average Home vs Away Penalties per Game",
       subtitle = "2005-2024",
       y = "Average Difference per Game",
       x = "Season")
```

To the naked eye the empirical trend looks like an inverted parabola, but we probably need to account for more than just the overall averages, which leads us to a model.
```{r} ## pull out relevant pbp variables per play nfl_penalty_pbp <- nfl_pbp |> filter(!is.na(penalty), penalty == 1) |> dplyr::select(season, play_id, game_id, posteam, defteam, game_date, posteam_site, posteam_site_ind, play_type, play_type_nfl,#PENALTY an important playtype penalty, penalty_type, penalty_team, penalty_yards, ep, epa, wp, wpa, yards_gained, half_seconds_remaining, game_half, down, ydstogo, yardline_100, posteam_score, defteam_score, qb_kneel, qb_spike, qb_dropback, rush_attempt, pass_attempt, punt_attempt, extra_point_attempt, field_goal_attempt, kickoff_attempt, special_teams_play) ``` Now I want to see if there is an impact on penalty yards per game based on the home field advantage but I want to be sure to include a team's typical penalty rate in a game, the NFL trend for that season, the opponents penalty yards given up per game, and perhaps other factors such as the predicted score which I will derive from the betting spread and total. So I need to take the penalty plays and aggregate them up to the game level, then add in various game specific information at the end such as the predicted spread and total. ```{r} nfl_penalty_game_summary <- nfl_penalty_pbp |> group_by(season, posteam, posteam_site, game_id, game_date, ) |> summarize(penalties = sum(penalty), penalty_yards = sum(penalty_yards), .groups = 'drop') |> left_join(nfl_schedule |> dplyr::select(game_id, home_team, away_team, total_line, spread_line, roof, surface, referee, stadium_id, stadium, home_rest, away_rest), by = c("game_id") ) ``` Now I need to adjust game variables to be from the possessing team's perspective instead of the home team's perspective. 
```{r}
nfl_penalty_game_summary <- nfl_penalty_game_summary |>
  mutate(team_rest = ifelse(posteam == home_team, home_rest, away_rest),
         opp_rest = ifelse(posteam == home_team, away_rest, home_rest),
         team_rest_effect = team_rest - opp_rest,
         team_pred_pt_diff = ifelse(posteam == home_team, -spread_line, spread_line),
         indoors = ifelse(roof == "dome" | roof == "closed", 1, 0),
         grass = ifelse(str_detect(surface, "grass"), 1, 0))
```

The last piece of information I want to include in my dataset is the number of plays in the game, excluding kneels and spikes, so we can normalize the number of penalties by how many "opportunities" there were for a penalty to be called.

```{r}
# keeping only these play types already excludes kneels, spikes, and kickoffs
nfl_plays_by_game <- nfl_pbp |>
  filter(play_type %in% c("pass", "run", "punt", "field_goal")) |>
  group_by(game_id) |>
  summarize(plays = n())

nfl_penalty_game_summary <- nfl_penalty_game_summary |>
  left_join(nfl_plays_by_game, by = 'game_id') |>
  mutate(penalty_yards_per_70_plays = (penalty_yards / plays) * 70)

penalty_model <- lm(penalty_yards_per_70_plays ~ posteam_site +
                      bs(team_rest_effect, df = 3) + indoors + grass +
                      team_pred_pt_diff + total_line,
                    data = nfl_penalty_game_summary)
```

Let's look at the results.

```{r}
tidy(penalty_model) |>
  gt() |>
  cols_label(
    term = "Variable",
    estimate = "Estimate",
    std.error = "Standard Error",
    statistic = "t-value",
    p.value = "p-value"
  ) |>
  fmt_number(
    columns = where(is.numeric) & !matches("season"),
    decimals = 2
  )
```

Interestingly enough, the model suggests that teams actually get about 0.95 more penalty yards per 70 plays at home than on the road after adjusting for these other factors. But since the `indoors`, `grass`, and `team_rest_effect` terms are nowhere near significant, I will remove them and refit the model.

```{r}
penalty_model2 <- lm(penalty_yards_per_70_plays ~ posteam_site +
                       team_pred_pt_diff + total_line,
                     data = nfl_penalty_game_summary)

tidy(penalty_model2) |>
  gt() |>
  cols_label(
    term = "Variable",
    estimate = "Estimate",
    std.error = "Standard Error",
    statistic = "t-value",
    p.value = "p-value"
  ) |>
  fmt_number(
    columns = where(is.numeric) & !matches("season"),
    decimals = 2
  )
```

The result is pretty much the same. Teams get about 0.97 more penalty yards per 70 plays at home than on the road after adjusting for the predicted spread and point total. This was certainly a surprise to me.

### Referees

One last thing that may be worth looking at is the same model we just fit, with a random effect added for the referee. Perhaps some referee crews call more penalties than others.

```{r}
referee_penalty_model <- lmer(penalty_yards_per_70_plays ~ posteam_site +
                                (1|referee) + team_pred_pt_diff + total_line,
                              data = nfl_penalty_game_summary)
```

The referee model didn't change the effects of the other variables much, but it does give us some estimates of the referee effects.

```{r}
tidy(referee_penalty_model) |>
  gt() |>
  cols_label(
    effect = "Effect",
    group = "Group",
    term = "Variable",
    estimate = "Estimate",
    std.error = "Standard Error",
    statistic = "t-value"
  ) |>
  fmt_number(
    columns = where(is.numeric) & !matches("season"),
    decimals = 2
  )
```

Here are the 10 referees with the most penalty yards per 70 plays called in their games.

```{r}
# per-referee random intercepts, largest first
referee_effects <- tidy(referee_penalty_model, effects = "ran_vals") |>
  arrange(desc(estimate)) |>
  rename(yards_per_70 = estimate,
         referee = level)

referee_effects |>
  slice(1:10) |>
  dplyr::select(referee, yards_per_70, std.error) |>
  gt() |>
  cols_label(
    referee = "Referee",
    yards_per_70 = "Penalty Yards per 70 Plays",
    std.error = "Standard Error"
  ) |>
  #round last 2 columns to 2 digits
  fmt_number(
    columns = where(is.numeric) & !matches("season"),
    decimals = 2
  )
```

And here are the 10 referees with the fewest penalty yards per 70 plays called in their games.

```{r}
referee_effects |>
  arrange(yards_per_70) |>
  slice(1:10) |>
  dplyr::select(referee, yards_per_70, std.error) |>
  gt() |>
  cols_label(
    referee = "Referee",
    yards_per_70 = "Penalty Yards per 70 Plays",
    std.error = "Standard Error"
  ) |>
  #round last 2 columns to 2 digits
  fmt_number(
    columns = where(is.numeric) & !matches("season"),
    decimals = 2
  )
```

Clearly some referee crews call more or fewer penalty yards per 70 plays than others. The overall spread of the estimates is -3.03 to 2.24 yards per 70 plays. Since a typical game is about 140 plays, the largest effect we estimate a referee can have on a game, on average, is a difference of 10.54 penalty yards between the most and least penalizing referees.

## Do Kansas City opponents get called for more penalties?

Well, that one is really impossible to determine from the data. Why? Even if the Chiefs have more penalties called on their opponents than other teams do, it could be due to a variety of causes, such as:

1. The Chiefs are usually winning, so teams are more desperate, leading to more penalties.
2. Andy Reid has coached for decades, so he knows how to avoid getting penalties called on his team.
3. Opponents lose their cool due to the pressure of playing the Chiefs dynasty.
4. ... and yes, if their opponents did get more calls then it ***could theoretically*** be due to referee help.

We can't assign a cause, but do we even know if their opponents are penalized more? Well, we can fit a model for that, where we try to predict penalty yards for the opponent based on:

- the current team,
- the opponent,
- the referee,
- and a normalizing intercept for the season.

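Written out, the model I have in mind looks roughly like the equation below. The notation is my own shorthand and not anything official: $y_i$ is the opponent's penalty yards per 70 plays in team-game $i$, $\alpha_{s[i]}$ is the intercept for that game's season, and the spread and total terms are the betting-line adjustments that the fit further down also includes. I am assuming the three random effects are independent and normally distributed, which is the usual `lmer` setup.

$$
y_i = \alpha_{s[i]} + \beta_1\,\text{spread}_i + \beta_2\,\text{total}_i + \beta_3\,(\text{spread}_i \times \text{total}_i) + u_{\text{team}[i]} + v_{\text{opp}[i]} + w_{\text{ref}[i]} + \varepsilon_i
$$

$$
u_{\text{team}} \sim N(0, \sigma_u^2), \quad v_{\text{opp}} \sim N(0, \sigma_v^2), \quad w_{\text{ref}} \sim N(0, \sigma_w^2), \quad \varepsilon_i \sim N(0, \sigma^2)
$$

Under this sketch, a positive $u_{\text{team}}$ would mean that team's opponents draw more penalty yards than average once the season, betting lines, opponent, and referee are accounted for.
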
```{r}
#calculate opp penalty yards per 70 plays
nfl_penalty_game_summary <- nfl_penalty_game_summary |>
  group_by(game_id) |>
  # game total minus the team's own value leaves the opponent's value
  mutate(opp_penalty_yds_per_70 = sum(penalty_yards_per_70_plays) - penalty_yards_per_70_plays) |>
  ungroup()
```

Now we fit the mixed effects regression, but I will only use data since the 2019 season, the first of Patrick Mahomes' Super Bowl seasons. I estimate random effects for the team and the opponent on opponent penalty yards, plus a random effect for the referee. I will also adjust for the expected closeness of the game by including the betting spreads and totals.

```{r}
nfl_penalty_game_summary_recent <- nfl_penalty_game_summary |>
  filter(season >= 2019) |>
  mutate(opp = ifelse(posteam == home_team, away_team, home_team)) |>
  rename(team = posteam)

team_penalty_help_mod <- lmer(opp_penalty_yds_per_70 ~ factor(season) +
                                spread_line*total_line +
                                (1|referee) + (1|team) + (1|opp),
                              data = nfl_penalty_game_summary_recent)
```

Let's look at the fixed effects:

```{r}
tidy(team_penalty_help_mod) |>
  dplyr::select(term:last_col()) |>
  mutate(across(estimate:last_col(), ~ round(.x, digits = 2)))
```

And the random effects for teams, sorted from the teams that get the biggest increase in opponent penalty yards per 70 plays to the smallest.

```{r}
tidy(team_penalty_help_mod, effects = "ran_vals") |>
  filter(group == "team") |>
  dplyr::select(team = level, estimate, std.error) |>
  arrange(desc(estimate))
```

# Takeaways

The NFL has a home field advantage, but it fluctuates year to year due to the small sample size in the NFL. The cause of that home field advantage has long been studied but is hard to pin down. One theory, that it is due to referee bias, doesn't appear to hold up. The average team gets about 0.97 more penalty yards per 70 plays at **home** than on the road after adjusting for the predicted spread and total. We also looked at individual referees to see how much they impact the game.
It turns out that, on average, the estimated gap between the most and least penalizing referees is only about 10.5 penalty yards per game.
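
For anyone who wants to check the referee gap, here is a minimal sketch of the arithmetic. It uses the two extreme referee estimates reported in the Referees section and assumes, as I did there, that a typical game has about 140 plays; the variable names are made up for this illustration.

```r
# Extreme referee random-effect estimates (penalty yards per 70 plays),
# copied from the Referees section above
lowest_referee_effect  <- -3.03
highest_referee_effect <- 2.24

# Assumption: a typical game has roughly 140 plays
plays_per_game <- 140

# Gap between the most and least penalizing referees, per 70 plays
spread_per_70 <- highest_referee_effect - lowest_referee_effect

# Scale the per-70-play gap up to a full game
spread_per_game <- spread_per_70 * plays_per_game / 70
spread_per_game
```

Changing `plays_per_game` shows how sensitive the per-game gap is to the assumed game length.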