Blog Post 1: Historical Election Data

Julien Berman

2024/09/05

In this first blog post, I will analyze and visualize historical presidential election data in order to better understand trends in the electoral college and develop a naive predictive model for the 2024 election. I will complete extension 1 as some exploratory analysis of my own.

Electoral College Timeline

Below I have produced two electoral college maps for the previous presidential elections from 1948 to 2020. The following interactive map plots the electoral college results for each election at the state level:

Notice that many states have, in recent years, consistently voted Republican (e.g. Alabama, Mississippi, Oklahoma) and many have consistently voted Democratic (e.g. California, New York, Maryland). These states are also typically won by large margins. In contrast, other states have fluctuated between parties (e.g. Pennsylvania, Wisconsin, Michigan). These are the states that could reasonably be won by either the Democratic or the Republican candidate. They are typically won by small vote margins (less than three percentage points) and are called “swing” states because they contain large numbers of swing voters who lack a predisposition to a given political party.

Let’s identify the swing states for each election year. In the graph below, the lighter the color, the more hotly contested a particular state was.

We can make several important observations from both of these maps. First, note that the Democratic vote share and the Republican vote share sum exactly to 100 percent. That is because we are measuring two-party vote share, not overall vote share.
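To make the two-party measure concrete, here is a minimal Python sketch. It is not the code behind the maps above: the function names and the vote counts are hypothetical, and treating the three-point margin mentioned earlier as a hard cutoff is an assumption made purely for illustration.

```python
def two_party_shares(dem_votes, rep_votes):
    """Return (Democratic, Republican) two-party vote shares in percent."""
    total = dem_votes + rep_votes  # third-party votes are deliberately excluded
    if total <= 0:
        raise ValueError("need at least one major-party vote")
    dem_share = 100.0 * dem_votes / total
    # The Republican share is the remainder, so the two always sum to 100.
    return dem_share, 100.0 - dem_share


def is_swing(dem_share, threshold=3.0):
    """Flag a state whose two-party margin is under `threshold` points."""
    rep_share = 100.0 - dem_share
    return abs(dem_share - rep_share) < threshold


# Hypothetical vote counts, for illustration only.
dem, rep = two_party_shares(dem_votes=505_000, rep_votes=495_000)
print(round(dem, 1), round(rep, 1), is_swing(dem))
```

Because the denominator contains only major-party votes, the two shares are complements by construction, which is why the maps never show them summing to anything other than 100 percent.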

Second, many states that voted reliably Republican in the 1970s and 1980s have been decided by much closer margins in recent elections. North Carolina and Arizona, for example, have only recently become swing states. This is part of a broader, well-documented demographic shift that has put many of the Sun Belt states in play for the Democrats. These states have been rapidly diversifying: the White share of eligible voters in Arizona and North Carolina has fallen by double digits since 2008, whereas the Latino and AAPI shares are up significantly. Indeed, Gen Z is the least White generation in American history. These demographic shifts have already contributed to Democrats’ success in Senate and gubernatorial races (Lu (2024), Zingher (2018)). Currently, a majority of the Sun Belt swing-state senators are Democrats, a stark contrast to 2008, when seven of the eight were Republicans.

Third, the results of the previous election in a particular state are often fairly good predictors of the current election’s outcome in that state. In the 2020 map, nine states have light coloring: Florida, Georgia, North Carolina, Arizona, Texas, Nevada, Wisconsin, Michigan, and Pennsylvania. Sure enough, seven of those nine states are the ones that many election forecasters currently predict will be the most hotly contested in 2024 (The Economist, The New York Times).

Preliminary Electoral College Model

Below, I attempt to implement a preliminary model that predicts the results of the 2024 election at the state level. I use the following indicators:

The results from the Ordinary Least Squares regression above provide the following key insights:

  1. A one-unit increase in the partisan lean from the previous election is associated with a 0.403-point increase in the state’s Democratic vote share, holding other variables constant. This is highly statistically significant (p < 0.001).
  2. A one-unit increase in the partisan lean from two elections prior is associated with a 0.319-point increase in the state’s Democratic vote share, also highly statistically significant (p < 0.001). These first two results make sense: states that have skewed more Democratic than the national average in the past are likely to have higher Democratic two-party vote shares now.
  3. When a Democratic candidate is from the state, the Democratic vote share increases by 3.654 points compared to a situation where neither candidate is from the state, significant at the 5% level (p = 0.017).
  4. The effect of a Democratic candidate being born in the state, while positive (2.429 points), is not statistically significant (p = 0.117).
  5. A one-unit increase in population density is associated with a 0.007-point increase in the Democratic vote share, which is statistically significant (p < 0.001). This result tracks with the well-documented trend that urban areas are much more likely to vote Democratic than rural areas.
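To show how coefficients like these combine into a single prediction, here is a rough Python sketch. It uses only the five estimates reported in the list above. The variable names are hypothetical, the intercept is not reported in this post and so appears as a placeholder argument, and any other terms in the fitted model are omitted.

```python
# Coefficients as reported in the list above; the variable names are hypothetical.
COEFS = {
    "lean_prev": 0.403,        # partisan lean, previous election
    "lean_prev2": 0.319,       # partisan lean, two elections prior
    "dem_home_state": 3.654,   # 1 if the Democratic candidate is from the state
    "dem_birth_state": 2.429,  # 1 if the Democratic candidate was born in the state
    "pop_density": 0.007,      # population density
}


def predict_dem_share(features, intercept):
    """Linear prediction: intercept plus the sum of coefficient * feature value.

    Features missing from `features` are treated as zero.
    """
    return intercept + sum(COEFS[name] * features.get(name, 0.0) for name in COEFS)


# Made-up feature values and a placeholder intercept, for illustration only.
example_state = {"lean_prev": 5.0, "lean_prev2": 4.0, "pop_density": 100.0}
print(predict_dem_share(example_state, intercept=48.0))
```

Raising `lean_prev` by one unit while leaving every other feature unchanged moves the prediction by exactly 0.403 points, which is what "holding other variables constant" means in the first item.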

Of course, this is only a preliminary model. The in-sample R^2 is just 0.414, which tells us that the model’s variables explain approximately 41.4% of the variance in Democratic vote share in the data the model was trained on. I will extend the analysis in future weeks by incorporating economic fundamentals, incumbency, and presidential approval ratings.
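For readers who want the definition behind that number, R^2 is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below computes it in plain Python. The function name and the toy numbers are illustrative assumptions and are unrelated to the election data.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Squared error of the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared error of always predicting the mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the outcome has no variance")
    return 1.0 - ss_res / ss_tot


# Toy numbers only: predicting the mean everywhere explains none of the variance.
print(r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```

A value of 0.414 therefore means the model's squared prediction error on the training data is about 58.6% of what it would be if every state-year were simply assigned the average Democratic vote share.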

The above model can be used to predict the electoral college results of the 2024 election. I have loaded in a dataframe containing this year’s values of the independent variables. Here is a map showing my prediction of the electoral college, which would lead to a comfortable Republican victory.
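Turning state-level vote-share predictions into an electoral college total is a simple tally, sketched below in Python. The state names, predicted shares, and electoral vote counts are made up for illustration. The sketch also assumes every state awards all of its electors to the statewide winner, which ignores the district-based allocation used in Maine and Nebraska.

```python
def tally_electoral_votes(predicted_dem_share, electoral_votes):
    """Sum electoral votes by party from predicted two-party Democratic shares."""
    totals = {"D": 0, "R": 0}
    for state, share in predicted_dem_share.items():
        # Winner-take-all assumption; an exact 50.0 tie is arbitrarily given to R.
        winner = "D" if share > 50.0 else "R"
        totals[winner] += electoral_votes[state]
    return totals


# Hypothetical inputs, not the predictions shown in the map.
predicted = {"State A": 53.2, "State B": 48.9, "State C": 49.6}
votes = {"State A": 10, "State B": 16, "State C": 6}
print(tally_electoral_votes(predicted, votes))
```

With real inputs covering all 538 electors, a candidate needs 270 to win.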