Patterns of Switching in the 2016 US Presidential Election

I am posting the slides and audio (below) here from a talk I gave at the LSE about the election, one day before it happened. Some things in here are right, some things in here are wrong. The polling estimates I was talking about were the penultimate estimates posted at so the numbers there now are a bit different. Most of the value here post-election is the estimates of which kinds of voters were switching R to D and D to R versus 2012. A proper post-mortem on the estimates overall is coming, but needs to wait for the final vote counts, which may be a few weeks. Polling methodology problems have gone back to being non-pressing, as they usually are, so we can all wait.

Brexit Referendum MRP Post-mortem

In the lead up to the UK referendum on EU membership, Doug Rivers and I posted an analysis of several weeks of YouGov polling data, using a methodology called multilevel regression and post-stratification (MRP). This is a different approach to analysing polling responses than the one YouGov uses for most of its UK polls, including those released immediately before and after the referendum on 23 June. In addition to yielding several interesting findings regarding the interactions of age, educational qualifications, party and referendum vote, which we discussed in that post on YouGov’s site, the MRP approach aims to better correct for demographic imbalances in raw polling samples. Now that we know the results, we can say that the MRP analysis provided a better, if still not perfect, estimate of the result of the referendum than the conventional methods used by YouGov and most other pollsters.
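As a rough illustration of the post-stratification half of MRP: once a model has produced an estimate for each demographic cell, the overall estimate is the average of those cell estimates, weighted by how many people in the population fall into each cell. The sketch below assumes the cell estimates are already in hand. The cells, leave shares and population counts are invented for illustration and are not taken from the analysis described above.

```python
# Hypothetical cells: (age group, education) -> (estimated leave share, population count).
# All numbers are invented for illustration; they are not YouGov estimates.
cells = {
    ("18-34", "degree"): (0.25, 300),
    ("18-34", "no degree"): (0.45, 500),
    ("35+", "degree"): (0.40, 400),
    ("35+", "no degree"): (0.65, 800),
}


def poststratify(cells):
    """Return the population-weighted average of the cell-level estimates."""
    total = sum(count for _, count in cells.values())
    weighted = sum(share * count for share, count in cells.values())
    return weighted / total


leave_share = poststratify(cells)
print(f"Post-stratified leave share: {leave_share:.3f}")
```

Swapping in the population counts for a different area, while keeping the same cell estimates, gives an estimate for that area.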

In our article several days before the referendum, we reported estimates of leave support at 51.0. During the day before the referendum, we re-ran the model and estimated leave at 50.1. However, once we were able to include that evening’s polling responses on the morning of the referendum, our final estimates found leave at 50.6, versus the final result of 51.9. Given my stated level of confidence, I consider this a success:

One of the powerful features of the MRP approach is that we could calculate the estimates on different geographies. Our original article included a map showing the results by parliamentary constituency. Here we look at the estimates at the level of the reporting areas, which were local authority districts. Below, we show the projected and actual leave vote shares for the 382 reporting districts.


Our local authority and regional estimates proved to be highly accurate. The correlation between the local authority leave share predictions and the actual leave shares in the referendum was 0.92. Very few local authority results were far off the predicted levels of leave support: 97% of the local authority predictions were within 10 percentage points of the true leave share, and 77% were within 5 percentage points. The worst predictions in either direction were off by less than 14 percentage points: Burnley (predicted 53.3% leave, actually 66.6%) and East Renfrewshire (predicted 39.4% leave, actually 25.7%).
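For readers who want to compute the same kinds of summaries on their own predictions, here is a minimal sketch of the three measures quoted above: the correlation, the share of areas within a given number of percentage points, and the largest miss. Only the Burnley and East Renfrewshire figures come from this post. "Area A" and "Area B" are made-up placeholders, so the printed summaries are not the real ones.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def share_within(predicted, actual, points):
    """Fraction of areas whose absolute error is at most `points`."""
    errors = [abs(p - a) for p, a in zip(predicted, actual)]
    return sum(e <= points for e in errors) / len(errors)


# Leave shares in percent. Burnley and East Renfrewshire are from the text;
# "Area A" and "Area B" are hypothetical placeholders.
predicted = {"Burnley": 53.3, "East Renfrewshire": 39.4, "Area A": 48.0, "Area B": 60.0}
actual = {"Burnley": 66.6, "East Renfrewshire": 25.7, "Area A": 51.0, "Area B": 58.0}

areas = list(predicted)
pred = [predicted[a] for a in areas]
act = [actual[a] for a in areas]

correlation = pearson(pred, act)
within_10 = share_within(pred, act, 10)
within_5 = share_within(pred, act, 5)
worst_miss = max(abs(p - a) for p, a in zip(pred, act))

print(f"correlation={correlation:.2f}, within 10 points={within_10:.0%}, "
      f"within 5 points={within_5:.0%}, worst miss={worst_miss:.1f} points")
```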


Our turnout model aimed to replicate 2015 general election turnout patterns. This means that our turnout model underestimated turnout overall, and overestimated turnout in Scotland (highlighted in red), where turnout was very high in the 2015 general election and relatively low in the referendum. However, excluding Scotland, generally the local authorities where we expected relatively high turnout based on 2015 and their demographic composition did indeed have relatively high turnout. Overall, the turnout model did well in that it got the general patterns right and the Scotland turnout over-estimate was responsible for only a very small amount of the error in our prediction.

YouGov released an “online exit poll” at 10pm on the day of the referendum, which reported leave at 48 and remain at 52. Since that poll needed to be analysed very quickly, it could not use the MRP analysis that we are discussing here. We have now been able to go back and re-analyse that data, and when we do so, we estimate leave and remain tied at 50, halving the error in the poll as originally reported. This suggests that the MRP analysis is improving on the standard methods for adjusting polling data, but there is still work to be done.
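The "halving" claim is simple arithmetic on the leave shares quoted in this post, and can be checked in a few lines. The variable names are just for illustration.

```python
# Leave shares in percent, all taken from the text above.
actual_leave = 51.9       # final referendum result
exit_poll_leave = 48.0    # online exit poll as originally reported
reanalysed_leave = 50.0   # same data re-analysed with MRP

original_error = abs(exit_poll_leave - actual_leave)    # about 3.9 points
reanalysed_error = abs(reanalysed_leave - actual_leave)  # about 1.9 points

# Ratio of the re-analysed error to the original error, a little under one half.
error_ratio = reanalysed_error / original_error
print(f"original error: {original_error:.1f}, "
      f"re-analysed error: {reanalysed_error:.1f}, ratio: {error_ratio:.2f}")
```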

Aside from better publicising what we were doing before the vote, we can identify two major areas for improvement in the future. First, our turnout model was based on the assumption that turnout patterns would look like the general election. This was not a bad assumption, but it did not capture the increase in turnout in the referendum, and led us to over-estimate turnout in Scotland because turnout was unusually high in the general election there and those levels were not maintained in the referendum. Second, our model used education qualifications, which were tremendously predictive of referendum vote, but not work status, social class or occupation. Given the voting patterns in the referendum, including these might have helped further close the gap between our estimates and the result, and including them might prove even more important in a UK general election.

An update on journal review debt

A couple years ago, I wrote a post on my journal review debt: the difference between the number of peer reviews I had completed and the number I had caused other political scientists to write. I am going to be an Associate Editor for the American Political Science Review starting September 1, which is going to mess up this calculation of journal review debt because it does not take into account editorial work (for which I will shortly be receiving my comeuppance). So before things get too much more complicated, here is an update of the graph from a couple years ago.


A Thought Experiment Regarding Experimental Ethics

The current controversy about a large scale experiment conducted in Montana by Stanford and Dartmouth political scientists raises several issues about research ethics in political science. To see a scan of the mailer that was sent to a random subset of Montana registered voters, follow this link.

Many of the objections I have read specifically refer to the form of the mailer. I am not going to engage with those objections here. I am interested in whether any experiment conveying the same information about candidates was necessarily objectionable, which is relevant for assessing whether the broad class of electoral field experiments are ethically problematic.

In an effort to help identify exactly what people find troubling about the study, I propose the following thought experiment. Imagine that the researchers had sent out the mailer without the Montana state seal. Imagine further that the graphical presentation of the scale positions was identical to that actually used, but that the title and description of the information were as follows:

Public Information About Donors to Nonpartisan Candidates

Donations to candidates for public office are public information. However they are collected in large databases that are difficult for most people to access. We have been working for several years to develop simple summaries of whether candidates are supported by more liberal or more conservative donors. The upcoming 2014 nonpartisan campaign for the Montana Supreme Court has candidates who receive donations from individuals around the US. When we look at the other candidates that those donors are giving money to, we can calculate whether the donors tend to be liberal or conservative. This gives a general indication of whether the donors for a given candidate are more liberal or more conservative on average, even when the candidates themselves are nonpartisan. The average positions of donors for each of the candidates in the upcoming Supreme Court election are shown above, compared to the average positions of donors to Barack Obama and Mitt Romney in the 2012 US Presidential Election.

For more information on how these calculations were constructed, and why we think they provide reasonable assessments of the extent to which a candidate’s donors are generally more liberal or conservative, please see This guide was created…

My version of the mailer is a mechanical description of what the CFScores are and how they are calculated. It is relatively difficult to argue that this is misleading, although not impossible, as one could quibble about the “liberal” vs “conservative” labels. The final paragraph implies that one might plausibly reject the provided interpretation of the scores in that light. Assuming you find the actual experiment objectionable, do you find this hypothetical version objectionable?

For some other comments on this, see:

Chris Blattman

Thomas Leeper

Journal Review Debt

For a while, I have wondered just how many more peer reviews I have caused to be written than I have written myself. I suspect that this kind of journal review debt is more or less inevitable as an early-career scholar. So rather than write a review that is due today, I decided to go back through my records to figure this out before it became too overwhelming to do so. It ought to be easier to keep these records going forward than it was to find all those old rejections in my email.

The measures I am considering are pretty straightforward. The number of reviews I have written is the number of reviews I have written. The number of reviews I am culpable for causing to be written is the sum of the author-adjusted number of reviews I have received for each of my journal submissions. I am calculating this simply as the number of reviews received for each submission divided by the number of authors on that submission. By this metric, if my co-author and I receive four reviews on our joint paper, we are each only responsible for two of those. If everyone applied this metric, the total number of reviews written and the total author-adjusted number of reviews received would be equal, which suggests it is a sensible metric. One could also achieve this property if one instead specified contribution fractions for each author, but that seems like more effort than it is worth.
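For anyone who wants to tally their own debt, the calculation above can be sketched in a few lines. The function names and the example submissions are hypothetical. Only the rule itself (reviews received divided by number of authors, summed over submissions, minus reviews written) comes from the description above.

```python
from fractions import Fraction


def reviews_caused(submissions):
    """Author-adjusted reviews received.

    `submissions` is a list of (reviews_received, number_of_authors) pairs,
    one pair per journal submission.
    """
    return sum(Fraction(reviews, authors) for reviews, authors in submissions)


def review_debt(reviews_written, submissions):
    """Reviews caused minus reviews written; a positive value means debt."""
    return reviews_caused(submissions) - reviews_written


# Hypothetical record: a two-author paper with four reviews, a solo paper
# with three reviews, and a four-author paper with two reviews.
submissions = [(4, 2), (3, 1), (2, 4)]
caused = reviews_caused(submissions)
debt = review_debt(3, submissions)
print(f"reviews caused: {float(caused)}, debt after writing 3 reviews: {float(debt)}")
```

Exact fractions are used here so that the per-author shares of a given paper always add back up to the number of reviews it received.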

A few points on scope. I have limited my consideration to initial reviews, mostly because it was too difficult to find the R&Rs given the way I organize my files. These will be proportional for reviews written and received only if my papers are equally likely to get R&Rs as everyone else’s papers are. I decline to speculate. I have also limited my consideration to political science, which excludes papers I was involved with in other fields before I began my PhD as well as one paper on paramecium genetics since then (which is more or less negligible anyway, as it had 20 authors).


Enough methodological details. The graphs more or less speak for themselves. I ran up a large debt in grad school and immediately thereafter, when no one was asking me to review and I was getting a lot of papers rejected because I had no idea what I was doing. Over the last two years I have been slowly clawing my way back towards solvency. Linear extrapolation here is obviously insane, but if we are so inclined, it looks like I can expect to get back in the black some time in the mid 2020s. However, if enough journal editors read this blog post, I may close the gap much sooner than that.