Ph.D. Candidate, Department of Politics, New York University
Email: nick.beauchamp@nyu.edu
I am a Ph.D. candidate in Political Science at New York University, with a focus on American politics and methodology. My current research centers on political persuasion, language, and ideology, employing automated text analysis and machine learning techniques to develop new ways of understanding the connections between speech, belief, and political action.
My dissertation develops a set of methods for understanding and predicting opinion formation and change in its natural environment: the torrent of linguistic communication surrounding all of us every day. It comprises three parts that show how text can: (1) predict ideology and voting behavior in legislatures; (2) measure, predict, and explain the persuasive effects of linguistic events like political advertisements; and (3) model the strategic arguments and opinion shifts found in political debates online or in committees.
Part 1 develops a general approach for estimating the ideology of legislators based on their speech. A variety of scaling techniques are developed and tested in the US and UK contexts, and the best are shown to work as well as the standard vote-based scalings, even in situations where vote-based scaling is infeasible (e.g., high-discipline legislatures).
Part 2, rather than merely estimating ideologies and validating them against existing measures, instead brings these techniques to bear on the concrete issue of persuasion in political advertising. Examining TV advertising in the 2004 US presidential election, it develops a new method for assessing the effects of hundreds of ads on vote intention; demonstrates that these effects are in part mediated by the textual content of the ads, as shown by the fact that the effects of new ads can be predicted based on their textual similarity to already measured ads; and develops text scaling methods for understanding which words and themes determine these effects. This constitutes a new and practical tool for connecting linguistic events to substantively important behaviors like voting.
Part 3 goes a step further, embedding text analysis in a model of political persuasion. It examines the multidimensional, highly structured, and multi-year arguments found on online political forums like dailykos.com, employing a generative model of correlated topics to classify issues, predict strategic argument responses, and model long-term opinion change. It serves as a framework for understanding not just online argumentation, but many forms of political debate, whether in advertising, conversation, Congress, or campaigns.
The project as a whole begins with text-based scaling, shows how such methods can have real-world predictive utility, and then uses those predictive powers to test models of strategic persuasion. It develops new techniques in automated text analysis and brings them to bear on substantively important empirical questions, expanding existing models of signaling and persuasion into the deeply complex world of speech.
Education
Ph.D., Political Science, New York University, expected 2012
Committee: Jonathan Nagler, Michael Laver, Nathaniel Beck
Dissertation: "Persuasion, Ideology, and Speech: Using automated text analysis to model opinion formation and change"
M.A., Political Science, New York University, 2007
M.A., Literature in English, Johns Hopkins University, 2001
B.A., Honors in Philosophy, Honors in English, Yale University, 1996
For more details, see my C.V.
Research
Research Interests
American Politics: Political Behavior, Campaigns, Congress, Political Psychology, Online and Social Networks
Political Methodology: Quantitative Text Analysis, Machine Learning, Bayesian Methods, Networks, Agent-based Models, Genetic Algorithms
Publications
"A Bottom-up Approach to Linguistic Persuasion in Advertising," Research Note in The Political Methodologist, Fall 2011
Nicholas Beauchamp, Henry Brady, Richard Fowles, Aviel Rubin, and Jonathan Taylor, 2004: "Findings of an independent panel on allegations of statistical evidence for fraud during the 2004 Venezuelan Presidential recall referendum," Observing the Venezuela Presidential Recall Referendum: Comprehensive Report, The Carter Center, Atlanta.
Working Papers
"Using Text to Scale Legislatures with Uninformative Voting"
ABSTRACT, PDF (under review)
Many models of legislative behavior require knowing the positions of individual legislators, but while such positions can be derived from roll-call votes when party discipline is weak, few legislatures exhibit such informative voting. This paper shows how legislators' written and spoken text can be used to scale individuals even in the absence of informative votes, by constructing two reference texts out of the aggregated speech of all members from each of two major parties. Although the popular Wordscores method can be used with this approach, this paper develops a Bayesian scaling that is more theoretically sound and produces empirically similar results to Wordscores; this paper also develops a vector-based scaling that works better than either. The unsupervised Wordfish scaling is also tested, but is found to do worse than the supervised approaches, and no better than a quick principal component analysis of the text. These scalings are first successfully validated in the US Senate against the benchmark vote-based DW-Nominate scores, and are then tested in the UK House of Commons. In the latter case, the scalings successfully separate members of different parties, order parties correctly, match expert and rebellion-based scalings reasonably well, and work across different years and even changes in leadership. Of practical importance, removing the technical language of legislation improves results greatly, as does using more extreme and out-of-power parties. Given that vote-based scalings capture little more than party and party loyalty, the text-based scaling developed here both matches what we already have fairly well, and may be a much more accurate window into the true ideological positions of political actors in legislatures and the many other domains where textual data are plentiful.
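As a rough illustration of the two-reference-text idea in the abstract above, the following minimal sketch applies the basic Wordscores logic: each word is scored by how strongly it is associated with each party's aggregated speech, and an individual's text is scored by averaging its word scores. This is not the Bayesian or vector-based scaling the paper develops. The function names, the reference scores of -1 and +1, and the toy token lists are assumptions made only for this example.

```python
from collections import Counter


def relative_freqs(tokens):
    """Relative frequency of each word within one token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def word_scores(ref_a_tokens, ref_b_tokens, score_a=-1.0, score_b=1.0):
    """Score each word by its relative association with the two reference texts.

    score_a and score_b are hypothetical positions assigned to the two
    aggregated party texts.
    """
    freq_a = relative_freqs(ref_a_tokens)
    freq_b = relative_freqs(ref_b_tokens)
    scores = {}
    for word in set(freq_a) | set(freq_b):
        fa = freq_a.get(word, 0.0)
        fb = freq_b.get(word, 0.0)
        # P(reference | word) weights each reference text's assigned position.
        scores[word] = (fa * score_a + fb * score_b) / (fa + fb)
    return scores


def scale_text(tokens, scores):
    """Mean word score over the tokens that appear in the reference texts."""
    scored = [scores[word] for word in tokens if word in scores]
    if not scored:
        return None  # no overlap with the reference vocabulary
    return sum(scored) / len(scored)


# Toy data, invented for illustration only.
party_a = "cut taxes cut spending".split()
party_b = "raise spending protect workers".split()
scores = word_scores(party_a, party_b)
position = scale_text("cut taxes".split(), scores)
```

In this toy case a speaker who uses only words unique to the first party lands at that party's reference position, while a word used equally often by both parties sits at the midpoint.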
"A Bottom-up Approach to Linguistic Persuasion in Advertising"
ABSTRACT, POSTER, PDF (under review)
This paper presents a new, bottom-up approach to understanding how the linguistic contents of television advertisements determine their persuasive effects. To deal with the huge variety of political advertising, existing econometric approaches generally require first reducing this data via factor analysis, scaling, or expert categorization. However, these reductions both limit the generality of any results, and often fail to find meaningful effects, particularly when there are more than a few underlying dimensions. As an alternative to such data reduction, I develop a simple but effective one-at-a-time regression technique to estimate the effect of each unique TV ad on survey-measured vote intention during the 2004 US presidential campaign. I find that these effects were in aggregate substantively significant, particularly on the Democratic side, and particularly if the campaigns had better selected which ads to broadcast. To understand why some ads were more effective, new automated text analysis procedures are employed to scale the ads' words according to their persuasive effects. This scaling summarizes the textual characteristics of the more effective Democratic and Republican ads and shows that the two sides used very different persuasive strategies. To validate this approach, a variety of text-based techniques are developed to predict the effects on vote intention of new, unmeasured ads based only on their textual similarity to field-tested ads. This constitutes both a powerful campaign tool, and a demonstration that automated text procedures can, if properly designed, provide a wide-ranging method for inferring the effects of the ever-varying linguistic events common in persuasion.
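The prediction step described in the abstract above, estimating the effect of an unmeasured ad from its textual similarity to ads whose effects have been measured, can be sketched in a simple form. The sketch below uses a cosine-similarity-weighted average over bag-of-words counts, which is only one plausible technique and is not claimed to be among those developed in the paper. The function names, ad texts, and effect sizes are all hypothetical.

```python
import math
from collections import Counter


def cosine(vec_a, vec_b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(vec_a[word] * vec_b[word] for word in vec_a if word in vec_b)
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_effect(new_text, measured_ads):
    """Similarity-weighted average of the measured effects.

    measured_ads is a list of (ad_text, estimated_effect) pairs.
    Returns None when the new ad shares no words with any measured ad.
    """
    new_vec = Counter(new_text.lower().split())
    weighted = []
    for text, effect in measured_ads:
        similarity = cosine(new_vec, Counter(text.lower().split()))
        weighted.append((similarity, effect))
    total = sum(similarity for similarity, _ in weighted)
    if total == 0:
        return None
    return sum(similarity * effect for similarity, effect in weighted) / total


# Hypothetical ads and effect sizes, invented for illustration only.
measured = [("jobs economy jobs", 0.5), ("war security", -0.2)]
predicted = predict_effect("jobs economy", measured)
```

Here the new ad overlaps only with the first measured ad, so its predicted effect equals that ad's effect. With real data one would check such predictions against held-out, field-tested ads.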
"A Network Model of Political Argument and Opinion Change"
ABSTRACT, POSTER
Automated text analysis offers an important new technique for understanding opinion formation and change. Panel surveys or lab experiments are limited to a small subset of pre-established dimensions measured in unnatural circumstances, but a vast and growing trove of raw text documenting argument and persuasion in its natural setting offers new opportunities to understand the true dimensions of debate and predict opinion change. Many of the most important new forums for popular debate are now online, where thousands of participants debate current politics every day. This paper develops a new methodological approach to modeling political arguments by examining the largest political blog/forum, dailykos.com. Rather than model speech via topic classification techniques such as latent Dirichlet allocation (LDA) or dimensional scaling, it augments the generative topic model behind LDA in two important ways. First, existing LDA posits a set of topics, a set of speakers, and a set of documents, but assumes the topics are independent of each other. But if we see topics not as independent themes, but as the ideas, facts, and arguments underlying a debate, then it is more appropriate to model these "topics" not as independent, but as correlated via a network of support and opposition. Second, each document is generally viewed as generated independently, without regard for context. However, we know that speech in debate is not independent of what was spoken before and by whom; indeed, without this strategic element, there would be little reason to speak at all.
The model here makes two modifications to the standard topic model. First, topics may be correlated, as in Blei & Lafferty's (2007) correlated topic model (CTM). Second, the probability of a speech act (document) is now dependent in part on the speech it is responding to: in particular, a speaker is more likely to make an argument (topic) that is heavily weighted by the speaker but lightly weighted by the previous speaker. This is what an argument is: making points that the other has omitted or underweighted, points which influence not just that argument, but all relevant arguments via their network of interrelations. Such a modification of the standard topic model is essential for capturing speech that is not just expressive, but also strategic and motivated, and that changes the opinions of the participants. The debates on dailykos.com are substantively important in their own right, and are also an excellent test bed for this approach: the arguments are threaded (so interlocutor pairs are clear) and only concern left-wing issues, so they are less likely to simplify into the standard left-right dichotomies. Furthermore, the blog employs a voting system in which users can approve other comments, providing an independent scaling of speech and speakers against which to test the text results. Both the vote-based scaling and the text classification find that the major topics of disagreement within the left hinge on President Obama, and to a lesser extent the media. The topic correlations allow one to predict endogenous opinion change (via a process akin to vector autoregression forecasting) better than the baseline, and suggest an inherent tendency towards polarization.
And the addition of strategic speech improves model fit over the CTM or LDA models, allowing better prediction of individual speech acts, a validation of the hypothesis that arguers speak to each other's weaknesses, and a suggestion that in this case, discussing Obama's stance on civil liberties is an effective persuasive move. Numerous other substantive conclusions can be drawn about debate on this blog, but more broadly, this method allows us to model argumentation and debate in other circumstances (such as congressional committees) with specific predictions about who says what, and how opinions may change, all in their natural, complex setting.
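The strategic-speech idea in the abstract above, that a speaker favors topics they weight heavily and the previous speaker weights lightly, can be illustrated with a toy calculation. The functional form below (own weight times one minus the other's weight, then normalized) is purely an assumption chosen for simplicity and is not the model's actual likelihood. The function name and the weights are hypothetical.

```python
def response_probabilities(speaker_weights, previous_weights):
    """Toy probability of raising each topic in reply to the previous speaker.

    Assumed form: p_k is proportional to speaker_k * (1 - previous_k),
    with all weights taken to lie in [0, 1].
    """
    raw = [
        max(own, 0.0) * max(1.0 - other, 0.0)
        for own, other in zip(speaker_weights, previous_weights)
    ]
    total = sum(raw)
    if total == 0:
        # No topic is favored; fall back to a uniform choice.
        return [1.0 / len(raw)] * len(raw)
    return [value / total for value in raw]


# Hypothetical topic weights: both topics matter equally to the speaker,
# but the previous speaker stressed the first and neglected the second.
probs = response_probabilities([0.5, 0.5], [0.9, 0.1])
```

Under this assumed form the speaker is far more likely to raise the second topic, the one the previous speaker underweighted.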
"Predicting and Explaining Supreme Court Decisions Using the Texts of Briefs and Oral Arguments"
ABSTRACT
This paper finds that the decisions of the Supreme Court over the last 10 years can be systematically predicted using the text of the briefs and oral arguments that precede those decisions. As well as being a potentially useful tool for parties before the Court, this approach gives insight into the tradeoffs between political and procedural decision-making on the Court. I use support vector machines (SVM) to predict decisions in two distinct ways: whether an outcome is liberal or conservative, and whether the decision favors the petitioner or respondent. Both of these approaches are able to predict decisions with about 58% accuracy, which is approximately the level achieved by experts. When the two approaches are combined, however, the accuracy rises to 62%, well above the accuracy of experts, showing that the two models are distinct and provide separate information and insight into the Court's decision-making process. I then employ a new method of SVM-based textual scaling to show which types of issues are decided on the liberal/conservative versus the petitioner/respondent dimensions, and which words and themes characterize the briefs most likely to succeed on both dimensions. I find that, on the ideological dimension, the liberal side is more likely to win when briefs emphasize ambiguity and interpretation, while the conservative side is more likely to win when briefs emphasize definite rules and factual matters. On the procedural dimension, respondents are more likely to win when briefs emphasize precedent, while petitioners are more likely to win when briefs emphasize logical arguments. These results should be of interest both to parties potentially appearing before the Court, and to political scientists seeking to understand Supreme Court decision-making.
"How do we combine issues? Simultaneously Estimating Spatial Metrics and Utility Functions"
ABSTRACT
Most spatial models of preference assume that the spaces in question are Euclidean, and that utility functions are quadratic. Although increasing work has recently been done in estimating utility functions from empirical data, and some theoretical work has been done with non-Euclidean spatial metrics, relatively little has been done to estimate spatial metrics from empirical data in the political context. This paper employs maximum likelihood techniques to directly estimate both spatial metrics and utility functions from ANES survey data. A simulation is also conducted to confirm that these methods can indeed accurately recover spatial and utility parameters. The results show that in the most general case, the spatial metric appears close to Euclidean, but the utility function is much less "risk-averse" than generally assumed. Furthermore, different combinations of issues produce different estimates for both spatial metrics and utility functions, although in all cases the utility functions are far from quadratic. Of practical importance, coefficients on policy variables appear to vary with different spatial metrics and utility functions, indicating that assumptions made about the metric of a space may be biasing empirical results.
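To make the distinction between the spatial metric and the utility function concrete, here is a minimal sketch under one common parametrization: a Minkowski distance with exponent p (p = 2 is Euclidean, p = 1 is city-block) and a power-loss utility with exponent alpha (alpha = 2 is quadratic). The abstract does not state which functional forms the paper estimates, so this parametrization, the function names, and the example positions are assumptions.

```python
def minkowski_distance(x, y, p):
    """Minkowski distance between two issue-position vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def utility(voter, candidate, p=2.0, alpha=2.0):
    """Assumed power-loss utility: minus the distance raised to alpha."""
    return -(minkowski_distance(voter, candidate, p) ** alpha)


# Hypothetical positions on two issues.
voter = [0.0, 0.0]
candidate = [3.0, 4.0]

euclidean = minkowski_distance(voter, candidate, 2)   # straight-line distance
city_block = minkowski_distance(voter, candidate, 1)  # sum of per-issue gaps
quadratic_loss = utility(voter, candidate, p=2.0, alpha=2.0)
linear_loss = utility(voter, candidate, p=2.0, alpha=1.0)
```

In an estimation setting, p and alpha would be treated as free parameters and fit by maximum likelihood alongside the other coefficients, rather than fixed at 2 in advance.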
Conference Presentations
"A Correlated Topic Model of Online Political Argument and Opinion Change," MPSA Annual National Conference, March 2012
"A Bottom-up Approach to Linguistic Persuasion in Advertising," APSA Annual Meeting, September 2011
"A Generative Model of Political Argumentation with Correlated Topics and Strategic Speech," poster, Society for Political Methodology Summer Conference, July 2011 *
"A Bottom-up Approach to Linguistic Persuasion in Advertising," Saint Louis Area Methods Meeting, April 2011 *
"Persuading Voters With Lots of Words: Predicting the Effects of TV Ads Using One-at-a-time Regression and Automated Text Analysis," MPSA Annual National Conference, March 2011
"How to Scale Legislatures with Text," Text as Data 2nd Annual Conference, March 2011 *
"Persuading voters with lots of words: A new technique for predicting the effects of TV ads using automated text analysis," poster, Society for Political Methodology Summer Conference, July 2010
Tools for Text conference/workshop participant; "How to Scale Legislatures with Text" on recommended reading list, June 2010 *
* Attendance funded by conference.
Teaching
Ph.D. level
TA, Math for Political Science, Jon Eguia, NYU, Fall 2008
TA, Game Theory I, Eric Dickson, NYU, Spring 2008
TA, Quantitative Research in Political Science I, Jonathan Nagler, NYU, Fall 2007
Undergraduate level
TA, Power and Politics in America, Patrick Egan, NYU, Spring 2011
Instructor, "Politics and Fiction," Agnes Scott College, 2002
For more teaching, see my C.V.
Sample syllabi on Public Opinion, and on Congress.
Sample syllabi on Bayesian Methods, and on Statistical Learning.
Work Experience
Reviewer for Political Analysis and Political Behavior
Consultant for electoral fraud analysis, The Carter Center, 2004-2005
Democracy Program intern, The Carter Center, 2003-2004
Web and Database Design, Nature Magazine, 1996-1997
For more work experience, see my C.V.
Mailing address:
New York University
Wilf Family Department of Politics
19 West 4th St, 2nd Floor
New York, NY 10012-1119
Phone: (212) 998-8500
Fax: (212) 995-4184
Email: nick.beauchamp@nyu.edu
Web: nickbeauchamp.com
Office address:
Wilf Family Department of Politics
19 West 4th St, 3rd Floor
Room 318
References
Jonathan Nagler
Department of Politics, New York University
Email: jonathan.nagler@gmail.com; Tel: +1 212 992 9676
Michael J. Laver
Department of Politics, New York University
Email: michael.laver@nyu.edu; Tel: +1 212 998 8534
Nathaniel Beck
Department of Politics, New York University
Email: nathaniel.beck@nyu.edu; Tel: +1 212 998 8535
Last updated: January 15, 2012