Final Post: 2024 Presidential Forecast

Introductory Note

Over the course of nine weeks, we’ve explored the power of the economy, incumbency, polling, demographics, and certain campaign data in predicting election outcomes. For the past several weeks, my model has predicted the same outcome: Harris wins Wisconsin, Michigan and Pennsylvania, while Trump takes the rest. My national popular vote prediction has steadily narrowed, with Trump now within a percentage point of Harris. If anything has become clear to me over the course of this process, it is that this race is truly a coin flip. I’m excited to see how my predictions fare.

Model Breakdown

In forecasting the 2024 Presidential Election, I created two separate models: one to forecast the national popular vote and another to predict the electoral college. Both forecasts rely primarily on fundamentals, polling, and lagged vote share. Both also include measures of “partisan swing,” the change in a party’s share of partisan identification between 2024 and either the year prior or the election prior. Including it is motivated by an underlying theory of increasing partisanship and polarization in the electorate.

National Popular Vote

My national popular vote model uses LASSO to select the most influential predictors from a long list of candidates: incumbency measures; June presidential approval rating; the percent of the country identifying as Democrat, Republican, or Independent, and how those numbers changed from both the previous year and the prior election; lagged vote share; weekly polling averages for the 30 weeks leading up to the election; and economic measures including GDP, RDPI, unemployment, and stock prices.

The Lasso regression model minimizes the following loss function:

$$ \underset{\beta}{\text{min}}\quad \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 + \lambda \sum_{j=1}^{p} |\beta_j| $$

where:

$y_i$ is the observed outcome for the $i$-th observation and $\hat{y}_i$ is the predicted value for the $i$-th observation,
$\beta_j$ is the coefficient for the $j$-th predictor,
and $\lambda$ dictates the strength of the Lasso penalty.
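
To make the role of the penalty concrete, here is a minimal Python sketch of the loss above. The toy data, the function name `lasso_objective`, and the value of $\lambda$ are illustrative assumptions, not the data or settings behind the forecast. It compares a sparse coefficient vector with one that gives a small weight to a second, nearly irrelevant predictor.

```python
def lasso_objective(X, y, intercept, beta, lam):
    """Sum of squared errors plus lam times the sum of |beta_j|."""
    sse = 0.0
    for row, target in zip(X, y):
        pred = intercept + sum(b * x for b, x in zip(beta, row))
        sse += (target - pred) ** 2
    return sse + lam * sum(abs(b) for b in beta)

# Made-up toy data: y is roughly 2 * x1; x2 only explains tiny leftovers.
X = [[1.0, 1.0], [2.0, -1.0], [3.0, 0.0]]
y = [2.1, 3.9, 6.0]

sparse = [2.0, 0.0]   # ignores the second predictor
dense = [2.0, 0.1]    # fits the toy data exactly

# With no penalty, the exact fit wins.
sparse_ols = lasso_objective(X, y, 0.0, sparse, lam=0.0)
dense_ols = lasso_objective(X, y, 0.0, dense, lam=0.0)

# With a penalty, the small gain in fit no longer pays for the extra coefficient.
sparse_pen = lasso_objective(X, y, 0.0, sparse, lam=0.5)
dense_pen = lasso_objective(X, y, 0.0, dense, lam=0.5)

print(sparse_ols > dense_ols, sparse_pen < dense_pen)
```

This trade-off is why many of the coefficients in a LASSO fit end up at exactly zero: a predictor stays in the model only if it reduces the squared error by more than its penalty costs.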

The LASSO coefficients are shown in the following table:

Predictor	Coefficient
(Intercept)	12.164
party	0.000
incumbent	0.000
incumbent_party	0.000
prev_admin	0.000
deminc	0.000
juneapp	0.000
percent	0.000
two_party_percent	0.000
ind_percent	-0.034
year_prior	0.000
year_prior_2p	0.000
swing1	0.109
swing1_2p	0.000
prior_election	0.000
prior_election_2p	0.000
swing4	0.000
swing4_2p	0.000
pv_lag1	0.072
pv_lag2	0.000
nat_weeks_left_1	0.275
nat_weeks_left_2	0.000
nat_weeks_left_3	0.000
nat_weeks_left_4	0.000
nat_weeks_left_5	0.168
nat_weeks_left_6	0.000
nat_weeks_left_7	0.000
nat_weeks_left_8	0.000
nat_weeks_left_9	0.000
nat_weeks_left_10	0.000
nat_weeks_left_11	0.040
nat_weeks_left_12	0.000
nat_weeks_left_13	0.000
nat_weeks_left_14	0.000
nat_weeks_left_15	0.000
nat_weeks_left_16	0.000
nat_weeks_left_17	0.016
nat_weeks_left_18	0.245
nat_weeks_left_19	0.000
nat_weeks_left_20	0.000
nat_weeks_left_21	0.000
nat_weeks_left_22	0.000
nat_weeks_left_23	0.000
nat_weeks_left_24	0.000
nat_weeks_left_25	0.000
nat_weeks_left_26	0.000
nat_weeks_left_27	0.000
nat_weeks_left_28	0.000
nat_weeks_left_29	0.000
nat_weeks_left_30	0.000
q2_gdp_growth	0.000
q2_rdpi_growth	0.000
GDP	0.000
RDPI	0.000
nat_unemployment	0.000
stock_adj_close	0.000

My LASSO model kept 8 predictors with nonzero coefficients, the ones most useful for reducing the model’s error: the percent of the country identifying as independent, the change in a party’s identification from the year preceding the election, vote share from the previous election, and five different weeks of polling (1, 5, 11, 17, and 18 weeks before the election). The cross-validated R-squared value for my model is 0.88. Polling data and partisan swing have the largest coefficients. As an example interpretation: holding the other predictors fixed, each one-percentage-point increase in a party’s identification from the year prior to the election to the election year raises that party’s candidate’s predicted vote share by about 0.11 percentage points. On the surface, that may not seem like a lot, but several “small” coefficients add up.
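
As a rough illustration of how the nonzero coefficients combine, the Python sketch below applies them to a set of inputs. The coefficients are copied from the table above; the input values are hypothetical placeholders, not the actual 2024 data, and the sketch assumes the coefficients are reported on the original scale of each predictor.

```python
INTERCEPT = 12.164
COEFS = {
    "ind_percent": -0.034,
    "swing1": 0.109,
    "pv_lag1": 0.072,
    "nat_weeks_left_1": 0.275,
    "nat_weeks_left_5": 0.168,
    "nat_weeks_left_11": 0.040,
    "nat_weeks_left_17": 0.016,
    "nat_weeks_left_18": 0.245,
}

def predict_vote_share(inputs):
    """Intercept plus the sum of coefficient * predictor value."""
    return INTERCEPT + sum(COEFS[name] * inputs[name] for name in COEFS)

# Hypothetical inputs: 40% independents, no partisan swing,
# 50% lagged vote share, and a flat 48% in every selected polling week.
baseline = {name: 48.0 for name in COEFS if name.startswith("nat_weeks_left")}
baseline.update({"ind_percent": 40.0, "swing1": 0.0, "pv_lag1": 50.0})

# Same inputs, but with a one-point gain in party identification.
shifted = dict(baseline, swing1=1.0)

base_pred = predict_vote_share(baseline)
swing_effect = predict_vote_share(shifted) - base_pred
print(round(base_pred, 3), round(swing_effect, 3))  # 50.116 0.109
```

The second number is the `swing1` coefficient itself, which is the one-point interpretation given above.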

Electoral College Vote

My electoral college model takes in fewer predictors. I fit two separate models for predicting the electoral college: one for states with significant polling aggregate data on FiveThirtyEight, and one for states without. Predictors in both models include state-level lagged vote share for the two prior elections, whether the candidate is a member of the incumbent party, national second-quarter GDP growth, average state-level unemployment, the change in partisan identification for each party since the last election, and state fixed effects. For states with polling aggregates, the mean polling average and the latest polling average are also included. My linear model uses the following equation:

$$ y_{ij} = \beta_0 + \beta_1 \text{voteshare}_{ij-1} + \beta_2 \text{voteshare}_{ij-2} + \beta_3 \text{incumbentparty}_{j} + \beta_4 \text{Q2GDP}_{j} + \beta_5 \text{unemployment}_{ij} + \beta_6 \text{partyswing}_{j} + \alpha_{i} + \epsilon_{ij} $$

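where $i$ denotes the state, $j$ the year, $\alpha_i$ the state fixed effect, and $\epsilon_{ij}$ the error term. In the fitted models below, partisan swing enters as two separate terms (one per party), and the model for states with polling aggregates adds the latest and mean polling averages. The first regression summary is the model for states with polls; the second is the model for states without.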

## 
## Call:
## lm(formula = .outcome ~ ., data = dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.6864 -0.9452  0.0075  1.0198  6.2622 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             8.490611   1.639295   5.179 6.01e-07 ***
## latest_pollav_DEM       0.636025   0.050577  12.575  < 2e-16 ***
## mean_pollav_DEM         0.063184   0.044010   1.436 0.152856    
## D_pv_lag1               0.172542   0.033103   5.212 5.16e-07 ***
## D_pv_lag2              -0.037803   0.025319  -1.493 0.137202    
## incumbent_partyTRUE     0.249170   0.345263   0.722 0.471443    
## q2_gdp_growth           0.022438   0.015765   1.423 0.156407    
## avg_state_unemployment  0.107177   0.098937   1.083 0.280153    
## dem_perc_swing         -0.006054   0.013972  -0.433 0.665316    
## rep_perc_swing         -0.006430   0.015751  -0.408 0.683592    
## stateCalifornia         2.116318   1.055291   2.005 0.046440 *  
## stateColorado           0.350983   0.995406   0.353 0.724806    
## stateFlorida           -0.916874   1.002994  -0.914 0.361888    
## stateGeorgia           -0.183100   1.015898  -0.180 0.857175    
## stateMaine              0.015522   1.391019   0.011 0.991109    
## stateMaryland           2.896403   1.092723   2.651 0.008763 ** 
## stateMassachusetts      3.919226   1.091799   3.590 0.000429 ***
## stateMichigan           1.912411   1.052550   1.817 0.070919 .  
## stateMinnesota          1.204050   1.037450   1.161 0.247373    
## stateMissouri          -0.536897   1.035603  -0.518 0.604800    
## stateMontana           -2.704191   1.609727  -1.680 0.094740 .  
## stateNebraska          -3.878691   1.212902  -3.198 0.001640 ** 
## stateNevada             0.136722   1.312436   0.104 0.917149    
## `stateNew Hampshire`    0.192059   1.092616   0.176 0.860668    
## `stateNew Mexico`       1.043815   1.297480   0.804 0.422191    
## `stateNew York`         1.510022   1.064815   1.418 0.157918    
## `stateNorth Carolina`  -1.280866   1.035287  -1.237 0.217649    
## stateOhio              -0.673344   1.007398  -0.668 0.504749    
## statePennsylvania       1.157246   1.023277   1.131 0.259618    
## stateTexas             -0.757347   0.992960  -0.763 0.446647    
## stateVirginia           0.854729   0.999525   0.855 0.393632    
## stateWashington         1.519807   1.024802   1.483 0.139846    
## stateWisconsin          1.406889   1.020135   1.379 0.169598    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.914 on 177 degrees of freedom
## Multiple R-squared:  0.936,	Adjusted R-squared:  0.9244 
## F-statistic: 80.86 on 32 and 177 DF,  p-value: < 2.2e-16

## 
## Call:
## lm(formula = .outcome ~ ., data = dat)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -8.826 -2.525  0.079  2.566  9.982 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            13.584297   4.080296   3.329  0.00114 ** 
## D_pv_lag1               0.571003   0.070990   8.043  5.1e-13 ***
## D_pv_lag2               0.097555   0.064725   1.507  0.13422    
## incumbent_partyTRUE    -2.563009   0.957730  -2.676  0.00842 ** 
## q2_gdp_growth          -0.027435   0.038468  -0.713  0.47703    
## avg_state_unemployment -0.030232   0.270109  -0.112  0.91106    
## dem_perc_swing          0.004528   0.027651   0.164  0.87019    
## rep_perc_swing         -0.037591   0.031593  -1.190  0.23631    
## stateAlaska             1.341766   3.674850   0.365  0.71562    
## stateArkansas           1.928218   2.240047   0.861  0.39096    
## stateConnecticut        5.612684   2.199396   2.552  0.01189 *  
## stateDelaware           5.855018   4.019572   1.457  0.14767    
## stateHawaii             5.296471   4.428031   1.196  0.23386    
## stateIdaho             -2.990001   3.829487  -0.781  0.43637    
## stateIllinois           5.937094   2.197593   2.702  0.00784 ** 
## stateIndiana            1.359286   2.087961   0.651  0.51621    
## stateIowa               2.425217   2.302244   1.053  0.29414    
## stateKansas             0.558267   2.284349   0.244  0.80732    
## stateKentucky          -1.752716   3.708228  -0.473  0.63726    
## stateLouisiana          1.805360   2.217026   0.814  0.41698    
## stateMississippi        1.623151   2.956515   0.549  0.58396    
## `stateNew Jersey`       5.239889   2.139260   2.449  0.01566 *  
## `stateNorth Dakota`    -4.363556   3.252359  -1.342  0.18208    
## stateOklahoma          -1.981762   2.699730  -0.734  0.46426    
## stateOregon             4.564242   2.104922   2.168  0.03198 *  
## `stateRhode Island`     6.972337   3.552869   1.962  0.05188 .  
## `stateSouth Carolina`   2.140027   2.634950   0.812  0.41820    
## `stateSouth Dakota`    -3.826466   3.372609  -1.135  0.25868    
## stateTennessee          1.360200   2.093539   0.650  0.51704    
## stateUtah              -2.104533   3.056566  -0.689  0.49237    
## stateVermont            6.587487   4.348919   1.515  0.13230    
## `stateWest Virginia`    0.815409   2.297063   0.355  0.72319    
## stateWyoming           -1.648196   2.570163  -0.641  0.52249    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.716 on 128 degrees of freedom
## Multiple R-squared:  0.7859,	Adjusted R-squared:  0.7324 
## F-statistic: 14.68 on 32 and 128 DF,  p-value: < 2.2e-16

For my model with polls, the latest poll average holds significant predictive weight, as does a state’s most recent presidential election vote share. On the other hand, lagged vote share is the most important predictor for states without polls, followed by incumbency. We also see in both models that state fixed effects can have very large coefficients and thus may have large predictive power, despite being less statistically significant than other variables. My model with polls has a cross-validated R-squared of 0.91, but my model without polls has an R-squared of only 0.66. Luckily, the states without polls are not close in this election, so the precise vote share prediction matters less there than in swing states.
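
To show how the polls model turns its inputs into a state prediction, here is a small Python sketch using the coefficients from the first regression summary. All input values are hypothetical placeholders rather than the real 2024 data, and it assumes the outcome is the Democratic vote share in percent. States without a listed fixed effect are treated as the reference level (effect 0), which is also an assumption.

```python
COEFS = {
    "latest_pollav_DEM": 0.636025,
    "mean_pollav_DEM": 0.063184,
    "D_pv_lag1": 0.172542,
    "D_pv_lag2": -0.037803,
    "incumbent_party": 0.249170,   # 1 if the candidate's party holds the White House
    "q2_gdp_growth": 0.022438,
    "avg_state_unemployment": 0.107177,
    "dem_perc_swing": -0.006054,
    "rep_perc_swing": -0.006430,
}
INTERCEPT = 8.490611
STATE_EFFECTS = {"Michigan": 1.912411, "Pennsylvania": 1.157246, "Wisconsin": 1.406889}

def predict_dem_share(state, x):
    """Intercept + state fixed effect + sum of coefficient * predictor."""
    linear_part = sum(COEFS[name] * x[name] for name in COEFS)
    return INTERCEPT + STATE_EFFECTS.get(state, 0.0) + linear_part

# Hypothetical inputs for one state-year (not actual data).
x = {
    "latest_pollav_DEM": 48.5, "mean_pollav_DEM": 47.5,
    "D_pv_lag1": 50.0, "D_pv_lag2": 48.0,
    "incumbent_party": 1, "q2_gdp_growth": 3.0,
    "avg_state_unemployment": 3.5,
    "dem_perc_swing": -1.0, "rep_perc_swing": 1.0,
}
pred = predict_dem_share("Pennsylvania", x)

# A one-point move in the latest poll average shifts the prediction by its coefficient.
bumped = dict(x, latest_pollav_DEM=x["latest_pollav_DEM"] + 1.0)
poll_effect = predict_dem_share("Pennsylvania", bumped) - pred
print(round(pred, 2), round(poll_effect, 3))
```

The latest poll average moves the prediction far more per point than any other input, which matches its weight in the summary.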

Predictions

National Popular Vote

Candidate	Predicted Vote Share	Lower Bound	Upper Bound
Harris	49.02198	46.90672	51.13725
Trump	48.08504	46.57114	49.59895

Electoral College

All in all, my model predicts that Harris will win the electoral college 270-268, an insanely tight margin.

Winner	States Won	Electors
Democrat	23	270
Republican	28	268
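
The 270-268 total can be reproduced with a short tally. In the Python sketch below, the swing state elector counts are the 2024 apportionment, and Harris is assigned Michigan, Pennsylvania, and Wisconsin as stated in the introductory note. The 226-219 split of the remaining 445 electors is an assumption chosen to match the totals above, not a figure taken from the model output.

```python
SWING_ELECTORS = {
    "Arizona": 11, "Georgia": 16, "Michigan": 15, "Nevada": 6,
    "North Carolina": 16, "Pennsylvania": 19, "Wisconsin": 10,
}
# Assumed split of all non-swing electors (including district-level votes).
SAFE_ELECTORS = {"Democrat": 226, "Republican": 219}

def tally(dem_swing_wins):
    """Add each swing state's electors to the party predicted to win it."""
    totals = dict(SAFE_ELECTORS)
    for state, electors in SWING_ELECTORS.items():
        winner = "Democrat" if state in dem_swing_wins else "Republican"
        totals[winner] += electors
    return totals

forecast = tally({"Michigan", "Pennsylvania", "Wisconsin"})
print(forecast)  # {'Democrat': 270, 'Republican': 268}

# Flipping Pennsylvania alone changes the winner.
without_pa = tally({"Michigan", "Wisconsin"})
print(without_pa)  # {'Democrat': 251, 'Republican': 287}
```

With no cushion at all, any single swing state flipping away from Harris would change the outcome.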

The following table presents the breakdown of my swing state predictions, including confidence intervals.

State	Harris Prediction	Lower	Upper	Margin	Winner
Michigan	51.1972	47.4458	54.9487	2.3945	Democrat
Pennsylvania	50.1614	46.4100	53.9128	0.3228	Democrat
Wisconsin	50.5712	46.8198	54.3226	1.1424	Democrat
Arizona	48.3335	44.5821	52.0850	-3.3329	Republican
Georgia	48.3110	44.5596	52.0624	-3.3780	Republican
Nevada	48.9965	45.2451	52.7479	-2.0070	Republican
North Carolina	47.0319	43.2805	50.7833	-5.9362	Republican

In a race as tight as 2024, it is nearly impossible to produce swing state predictions whose confidence intervals identify a clear winner, and mine are no exception: every interval above includes 50%.
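
The point about the intervals can be checked directly against the swing state table. The Python sketch below uses only the numbers in that table. It tests whether each interval contains 50, computes each interval's half-width, and compares the reported margin with twice the distance of the prediction from 50. That margin definition is an assumption inferred from the numbers. The half-widths of about 3.75 points are consistent with 1.96 times the 1.914 residual standard error of the polls model, though that is also an inference rather than a documented method.

```python
# (state, Harris prediction, lower, upper, reported margin)
SWING = [
    ("Michigan", 51.1972, 47.4458, 54.9487, 2.3945),
    ("Pennsylvania", 50.1614, 46.4100, 53.9128, 0.3228),
    ("Wisconsin", 50.5712, 46.8198, 54.3226, 1.1424),
    ("Arizona", 48.3335, 44.5821, 52.0850, -3.3329),
    ("Georgia", 48.3110, 44.5596, 52.0624, -3.3780),
    ("Nevada", 48.9965, 45.2451, 52.7479, -2.0070),
    ("North Carolina", 47.0319, 43.2805, 50.7833, -5.9362),
]

# Does every interval straddle the 50% line?
all_contain_50 = all(lower < 50.0 < upper for _, _, lower, upper, _ in SWING)

# Half-width of each interval, in percentage points.
half_widths = [(upper - lower) / 2 for _, _, lower, upper, _ in SWING]

# Gap between the reported margin and 2 * (prediction - 50).
margin_gaps = [abs(margin - 2 * (pred - 50.0)) for _, pred, _, _, margin in SWING]

print(all_contain_50)
print(round(min(half_widths), 3), round(max(half_widths), 3))
print(max(margin_gaps) < 0.0005)
```

Even North Carolina, the largest predicted margin, has an upper bound above 50, so none of the seven calls is statistically distinguishable from a tie.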

Below is the map of all my electoral college predictions and their associated vote share margins. Negative margins represent Trump wins and positive margins represent Harris wins.

Alex Heuss

2024/11/03
