Given the surprising results of recent elections, voting methods have drawn lots of attention. Research in social choice theory reveals the underlying complexity — and flaws — of different methods of expressing preferences.
Elections have become increasingly contentious, polarizing and suspect. Over the past few decades, polling, particularly on presidential races, has suffered a series of embarrassing failures, from Donald Trump’s shocking defeat of the heavily favored Hillary Clinton in 20161 to Joe Biden’s subsequent victory over Trump in 2020. Despite Biden’s considerable margin of victory, Trump and Republicans have continued to argue that somehow the 2020 presidential election was stolen — there is no evidence for those charges — and to raise questions about future elections. The outcomes of these elections not only highlighted polling failures but raised questions about everything from primary contests involving large numbers of candidates to the Electoral College and its winner-take-all allocation of most states’ electors and, more profoundly, flaws in the most intuitive foundation of most U.S. elections: the notion that the candidate who wins the most votes — a rule known as plurality, first-past-the-post or winner-take-all — wins the election.
In fact, the movement to create an election system that more accurately reflects voters’ often complex preferences continues to gain ground. These non-majoritarian election methodologies go by a number of different names — ranked-choice or preference voting most commonly — and feature a variety of options and features. FairVote, a nonprofit organization that advocates for electoral reform in the U.S., keeps a running tally of local and state jurisdictions that have turned to some form of ranked-choice voting (RCV).2 Maine, for instance, became the first state to adopt RCV, in 2016, and uses it in all state and federal primaries and in congressional elections; Republican Senator Susan Collins won reelection in 2020 under RCV rules. New York City adopted ranked voting in 2019 and is rolling it out in primary and general election campaigns through 2025; the city’s new mayor, Eric Adams, won both a crowded primary and the general election through ranked voting. Meanwhile, dozens of cities across the U.S., and jurisdictions around the world (Australia, Ireland, parts of the U.K., Belgium and the Netherlands), have embraced the concept.
Reviews of the New York City RCV ballot were mostly positive.3 Despite initial confusion, some 78 percent of voters said they understood how the ranked system worked “extremely or very well,” and turnout was high, up 29 percent. Fewer than 15 percent of voters had “inactive ballots” in the final elimination round, meaning none of their ranked choices made it that far, compared with 33 percent in the 2013 first-past-the-post mayoral primary. Despite his victory, Adams still worried that poorer, less-educated voters would not understand how the ranking system worked.
FairVote analysts Andrew Douglas, Rob Richie and Elliot Louthen argue that a change of voting methods in the U.S. is necessary to capture more of the electorate’s complex individual preferences for candidates, particularly in campaigns with crowded fields or low-participation elections. Their analysis reveals, for instance, that on Super Tuesday 2016 — March 1, when the largest number of states held primary elections — Trump would have lost nine of 11 states instead of picking up seven if voters had submitted a ranking of candidate preferences rather than picking just one individual, as in the usual plurality voting process.4 If that had occurred, Texas Senator Ted Cruz, not Trump, might well have been the leading Republican candidate for president in the subsequent election.
Of course, that was not the case, and plurality rule — often loosely called majority rule — remains the predominant U.S. voting methodology. The study of voting or, more technically, preference aggregation is part of a discipline known as social choice theory, which focuses on how people attempt to make optimal choices collectively.
This article examines a number of those systems and the ideas and mathematics that support them. We will explore practices that many people assume are straightforward and uncontroversial but are, in fact, complex and often flawed. More broadly, the activity known as preference aggregation has a thriving existence beyond traditional voting in political contests. It has been used across many disciplines, from economics to philosophy, and has achieved major successes in areas as seemingly different as search engine and ratings methodologies and molecular biology and genomics.
But let’s begin with elections, which lie at the heart of democratic political systems and led to the birth of social choice theory in the first place.
The Theory of Aggregating Individual Choice
Social choice theory dates to the late 18th century, when the Marquis de Condorcet, a French philosopher and mathematician, presented his ideas on the pitfalls of making collective decisions based on individual preferences, and supported them mathematically. Condorcet was a central figure of the Enlightenment, well known and controversial for his forward-looking views on slavery and freedom; he died in prison in 1794, in the midst of the French Revolution. One of his pioneering scientific works, the 1785 “Essay on the Application of Analysis to the Probability of Majority Decisions,” provided the basis for social choice theory.5 The essay defines a way of voting in which “an alternative defeats every other by a simple majority.” A so-called Condorcet winner is defined as a candidate or issue that defeats or ties every other alternative in pairwise majority contests.6
If an election process consistently finds the Condorcet winner when it uniquely exists, then it has what’s known as the Condorcet property. However, in many cases no such collective decision emerges; no single candidate wins a majority of the pairwise contests. This is known as the Condorcet paradox. For example, consider three candidates — A, B and C — and three voters: x, y and z. If x ranks A > B > C, y ranks B > C > A and z ranks C > A > B, then A beats B two votes to one, B beats C two to one and C beats A two to one; the majority preference cycles, and there is no Condorcet winner. The paradox arises from the fact that while individual preferences are “transitive” (if a voter prefers A over B and B over C, then the voter prefers A over C), the collective preference may end up “intransitive” (A beats B and B beats C, yet A does not beat C). This paradox often blocks the creation of an optimal, transitive ordering of candidates. Another way to say this is that while individual preferences are rational, or transitive, the collective decision may be irrational, or intransitive.
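The paradox is easy to verify mechanically. Below is a minimal Python sketch (candidate and voter labels are illustrative) that runs every pairwise majority contest over the three cyclic ballots:

```python
def pairwise_winner(ballots, a, b):
    """Return whichever of a, b a majority ranks higher, or None on a tie."""
    a_wins = sum(1 for ballot in ballots if ballot.index(a) < ballot.index(b))
    b_wins = len(ballots) - a_wins
    if a_wins == b_wins:
        return None
    return a if a_wins > b_wins else b

def condorcet_winner(ballots):
    """A Condorcet winner beats or ties every rival in pairwise contests."""
    candidates = ballots[0]
    for c in candidates:
        if all(pairwise_winner(ballots, c, d) != d
               for d in candidates if d != c):
            return c
    return None  # the majority relation cycles: the Condorcet paradox

# Cyclic ballots: A > B > C, B > C > A, C > A > B.
ballots = [["A", "B", "C"], ["B", "C", "A"], ["C", "A", "B"]]
# A beats B 2-1, B beats C 2-1, C beats A 2-1, so no Condorcet winner exists.
```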
Social choice research has revealed deeper difficulties in preference aggregation. One of the most important insights is attributed to economist and Nobel laureate Kenneth Arrow. In what he initially called his general possibility theorem for social welfare functions, published in 1950 (later commonly known as Arrow’s impossibility theorem), he demonstrated that no ranked voting system, in which voters rank candidates by preference, can simultaneously meet a set of basic fairness criteria if voters face three or more distinct alternatives.7 The criteria include unrestricted domain (all preferences of all voters are taken into consideration), nondictatorship (the outcome cannot simply mirror a single voter’s preferences while ignoring everyone else), Pareto efficiency (if every voter prefers A over B, the group ranking must place A over B) and the independence of irrelevant alternatives (combined preferences for A and B depend only on individual preferences between A and B, and not on any third factor — say, C). Practically speaking, the independence of irrelevant alternatives crops up when a new candidate, such as a third-party candidate, joins a race.
Arrow’s theorem directly questioned the ultimate fairness of democratic elections.
University of Michigan philosopher Allan Gibbard went on to generalize Arrow’s ranked model to include cardinal preferences, meaning that voters not only rank their choices but can quantify differences among them by assigning grades to candidates. Gibbard also allowed nondeterministic preference aggregation functions, which introduce chance into the social choice (in practice, some votes are excluded randomly).8 Under these conditions, Gibbard’s theorem states that any process of collective decision making either ends up being dictatorial, limits the possible outcomes to two options or encourages agents to act strategically — that is, to submit preferences that don’t reflect their true opinions but are based on expectations of how others will vote. Individuals may vote not because they like their candidate but because they dislike another candidate more and wish to defeat them. That is certainly the case in an election with two relatively unpopular candidates, as in the 2016 U.S. presidential race, or in a field of multiple candidates, such as primaries.
Ranked Voting Systems
One way to get around Arrow’s impossibility theorem is to formulate voting rules that relax at least one of his fairness criteria, usually the independence of irrelevant alternatives (IIA) axiom. IIA requires that replacing an A > B > C ballot with A > C > B cannot change the collective preference order between A and B. But given that A > C > B arguably expresses a stronger preference for A over B than A > B > C does, preserving IIA is not always desirable.
One such rule is the Borda count, devised around 1770 by French mathematician Jean-Charles de Borda: voters rank all n candidates, and each candidate receives n − 1 points for every first-place vote, n − 2 for every second place and so on down to zero, with the highest total winning. During Borda’s life, the French Academy of Sciences used his method to elect its members. However, after Borda’s death in 1799, Napoleon Bonaparte became president of the academy and replaced the Borda count with his own method. Nevertheless, it is still used in academic institutions and political jurisdictions (for example, the Slovenian Parliament uses it to distribute minority seats), while the Dowdall variant, which weights ranks harmonically (1, 1/2, 1/3 and so on), is used in the Pacific island nation of Nauru.9
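As a concrete sketch, the Borda and Dowdall scoring rules can be implemented in a few lines of Python (the ballots are invented for illustration):

```python
def borda(ballots):
    """Classic Borda count: n - 1 points for first place, down to 0 for last."""
    n = len(ballots[0])
    scores = {c: 0 for c in ballots[0]}
    for ballot in ballots:
        for rank, candidate in enumerate(ballot):
            scores[candidate] += n - 1 - rank
    return scores

def dowdall(ballots):
    """Dowdall (Nauru) variant: harmonic weights 1, 1/2, 1/3, ..."""
    scores = {c: 0.0 for c in ballots[0]}
    for ballot in ballots:
        for rank, candidate in enumerate(ballot):
            scores[candidate] += 1.0 / (rank + 1)
    return scores

ballots = [["A", "B", "C"], ["A", "C", "B"], ["B", "C", "A"]]
# borda(ballots) gives A = 2+2+0 = 4, B = 1+0+2 = 3, C = 0+1+1 = 2.
```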
Another increasingly popular preferential voting method is the single transferable vote (STV), which is designed to achieve proportional representation in a multiseat contest. Voters list their preferences from a slate of candidates. Votes are totaled, and a quota is derived for the number of votes needed to win a seat. The most common is the Droop quota: (total number of valid votes)/(number of seats to win + 1), rounded down, plus one vote; in a single-seat contest this reduces to a simple majority of 50 percent plus one. Candidates who reach the quota are elected, and their surplus votes beyond the quota are transferred to those voters’ second choices, pushing more candidates past the quota. If more candidates than open seats remain, the candidate with the fewest top votes is eliminated and those ballots are redistributed to their next choices. The process continues until every vacant seat is filled.
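The quota arithmetic can be sketched directly; the vote totals below are illustrative:

```python
def droop_quota(valid_votes, seats):
    """Droop quota: floor(valid_votes / (seats + 1)) + 1, the smallest whole
    number of votes that no more than `seats` candidates can all reach."""
    return valid_votes // (seats + 1) + 1

# With 100 valid votes: one seat requires 51 votes (a simple majority),
# while three seats require 26 votes each.
```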
STV is usually referred to as instant runoff voting (IRV) when it is used to elect a single winner. In this case, the candidates with the fewest first-choice votes are successively eliminated (and their votes redistributed) until the field is reduced to two. The final round becomes an instant runoff because the top two go head-to-head. We can use the order of the eliminated candidates to obtain an aggregated ranking. A similar but nonpreferential multiround system, called the exhaustive ballot, eliminates the alternative with the fewest votes and asks for a revote if no remaining alternative achieves an absolute majority. In practice, both STV and IRV are extremely difficult for outside actors to manipulate, so they are considered to be a great improvement over plurality rule.10 The world soccer federation, FIFA, and the International Olympic Committee use the exhaustive ballot to select host cities, while the Academy Awards and Australian federal elections employ IRV.
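A minimal Python sketch of the IRV elimination loop follows; it assumes complete ballots (every voter ranks every candidate) and ignores the tie-breaking rules a real election would need:

```python
from collections import Counter

def instant_runoff(ballots):
    """Instant runoff: repeatedly eliminate the candidate with the fewest
    first-choice votes, transferring those ballots to each voter's next
    surviving choice, until some candidate holds an absolute majority."""
    remaining = set(ballots[0])
    while True:
        # Count each ballot for its highest-ranked surviving candidate.
        tally = Counter(next(c for c in b if c in remaining) for b in ballots)
        leader, votes = tally.most_common(1)[0]
        if votes * 2 > len(ballots) or len(remaining) == 1:
            return leader
        remaining.remove(min(remaining, key=lambda c: tally.get(c, 0)))

# Five voters, three candidates: B has the fewest first choices and is
# eliminated; B's supporter transfers to C, who then beats A 3 votes to 2.
ballots = [["A", "B", "C"], ["A", "C", "B"],
           ["C", "A", "B"], ["C", "B", "A"], ["B", "C", "A"]]
```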
The FairVote analysis discussed earlier, which concluded that the 2016 presidential election might have had a different outcome with a different voting system, also applies the IRV method. Figure 1 summarizes the results of the simulation based on polls for the 2016 Republican primary election in Georgia. We can clearly see the benefits of using full rankings, which contain much more information than a winner-take-all ballot. In this example, Trump led by a wide margin in the first round and thus won under plurality rule. However, by taking into account all the preferences for the remaining candidates, consecutively eliminating the least-preferred options and recalculating the rankings on the reduced subsets, a ranked-choice system would have arrived at a different result, because voters for Cruz and “all other candidates” preferred Rubio to Trump.
Still, neither plurality rule nor preference ranking is perfect; neither consistently selects the Condorcet winner when one exists. However, this can be remedied by a more complex rank aggregation method called the Kemeny rule.
Rank Aggregation with the Kemeny Rule
John Kemeny developed a Condorcet-consistent voting method in 1959. Kemeny, a Hungarian-American mathematician and computer scientist, is most famous as a co-developer of the BASIC programming language; he also served as president of Dartmouth College. Kemeny’s rule selects, out of all n! possible orderings, the final ranking that minimizes the total number of pairwise disagreements with the individual preference orders, creating what’s known as a Kemeny consensus. Measuring those disagreements amounts to counting the adjacent swaps a bubble sort (an algorithm that repeatedly exchanges neighboring out-of-order elements) would need to transform one ranking into the other. The count is known as the bubble-sort or Kendall tau distance (after British statistician Maurice Kendall, who developed the metric in 1938); the larger the distance, the more dissimilar the two rankings, and vice versa. There is also an intuitive probabilistic interpretation of the Kemeny rule.11 Suppose voters’ preferences are noisy versions of an underlying true ordering, obtained by flipping each pairwise comparison with some probability less than 0.5. Then the Kemeny consensus is the most likely true ordering to have produced the observed preferences.
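To make the definitions concrete, here is a brute-force Python sketch: the Kendall tau distance counts pairwise disagreements between two rankings, and the Kemeny consensus is the permutation that minimizes the total distance to the ballots (workable only for a handful of candidates):

```python
from itertools import combinations, permutations

def kendall_tau(r1, r2):
    """Number of candidate pairs the two rankings order differently,
    i.e., the adjacent swaps a bubble sort needs to turn one into the other."""
    return sum(1 for a, b in combinations(r1, 2)
               if (r1.index(a) < r1.index(b)) != (r2.index(a) < r2.index(b)))

def kemeny_consensus(ballots):
    """Exhaustive search over all n! orderings: return the ranking that
    minimizes the total Kendall tau distance to the ballots."""
    return min(permutations(ballots[0]),
               key=lambda order: sum(kendall_tau(list(order), b)
                                     for b in ballots))

# Two voters rank A > B > C and one ranks B > A > C; the consensus
# A > B > C disagrees with only one pairwise comparison in total.
ballots = [["A", "B", "C"], ["A", "B", "C"], ["B", "A", "C"]]
```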
Despite its superior properties, Kemeny aggregation suffers from a serious problem in practice: In computer science terms, finding the consensus is NP-hard, even with only four voters,12 though numerous fast approximation algorithms have been developed to help.13 The hardness lies in the number of candidates, not voters: the cost is linear in the number of voters, so exhaustive search remains computationally feasible for up to about ten candidates. Another practical workaround is to settle for a ranking with the so-called local Kemeny property, whose Kendall tau distance to the ballots cannot be reduced by any single swap of neighboring alternatives.14
The task of creating an accurate consensus ranking is not only a problem in politics. There are many situations where individual “judges” focus on different evaluation criteria. In this context, the aim of rank aggregation is to gather the knowledge and produce a best possible final ranking. The following examples illustrate the successful application of social choice theory in rank aggregation problems in various other fields.
Applications beyond Politics
Google is the most famous, and the most popular, search engine currently operating. However, there are dozens of other publicly available specific- and general-purpose query-based search engines.15 Metasearch is the task of aggregating the rankings of different search engines. One of the field’s most important goals is to combat “spam.” (The word comes from a Monty Python’s Flying Circus skit in which a waitress reads off a menu that turns out to contain only Spam, a brand of canned cooked pork, to a customer trying to avoid it.) A web page is spam if it achieves a higher ranking in search results than it deserves based on its content and link structure.16 Spam typically contains hidden text and out-of-scope links specifically designed to achieve a top ranking for a wide range of possible queries.
There is a way to deal effectively with spam. The extended Condorcet criterion (ECC) states that if the alternatives can be partitioned into two sets (X, Y) such that for each x in X and y in Y the majority prefers x over y, then each x is ranked above each y in the aggregate.17 (When X has only one element, we recover the Condorcet condition.) By definition, spam targets the indexes of specific search engines. If only a few search engines give a page a high ranking, ECC ensures that the page drops in the metasearch result, as the majority shows greater preference for honest pages. It turns out that rankings whose Kendall tau distance cannot be decreased by a single transposition of two neighboring elements — that is, rankings with the local Kemeny property — are ECC consistent.18 A local Kemeny optimization can be performed on any starting ranking — for example, a Borda count — by adding the candidates one at a time in that order and bubbling each newly added element up as long as a majority of voters prefer it to the element above.
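That bubbling procedure can be sketched in Python; starting from any initial ranking (a Borda result, say), each candidate is inserted and bubbled up while a strict majority of voters prefers it to its upper neighbor (the ballots are illustrative):

```python
def majority_prefers(ballots, a, b):
    """True if a strict majority of ballots rank a above b."""
    wins = sum(1 for r in ballots if r.index(a) < r.index(b))
    return 2 * wins > len(ballots)

def local_kemenize(initial, ballots):
    """Insert candidates one by one in the order of `initial`, bubbling each
    new candidate up past any neighbor the majority ranks it above. The
    result is locally Kemeny-optimal: no single adjacent swap can reduce
    its total Kendall tau distance to the ballots."""
    result = []
    for c in initial:
        result.append(c)
        i = len(result) - 1
        while i > 0 and majority_prefers(ballots, result[i], result[i - 1]):
            result[i], result[i - 1] = result[i - 1], result[i]
            i -= 1
    return result
```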
Link prediction is another field where social choice theory has been successfully applied. Link prediction observes a network and tries to predict future connections. Successful link prediction algorithms help build better recommendation systems for buying products or services; predict outbreaks of disease and crowding in transportation networks; and suggest new friends, jobs and partners on social media sites. The elementary predictors are topological features, which express properties of the nodes and relationships among them. They typically include the number of common neighbors; the length of the shortest path between two nodes; the ratio of the nodes’ common neighbors to the union of their neighbors, known as Jaccard’s coefficient; and the PageRank of the nodes (PageRank is the algorithm Google’s search engine uses to rank pages). Intuitively, if two nodes have many common neighbors, they are more likely to link in the future. Better results are achieved if the whole feature set is used simultaneously. A major trend is to use supervised classification models on the features to predict whether two nodes will link; such models can be trained on the historical evolution of a partial or full graph.
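As an illustration, one of these elementary predictors, Jaccard’s coefficient, takes only a few lines; the toy adjacency map is hypothetical:

```python
def jaccard(neighbors, u, v):
    """Jaccard's coefficient: the nodes' common neighbors divided by the
    union of their neighbors (0.0 when both neighbor sets are empty)."""
    common = neighbors[u] & neighbors[v]
    union = neighbors[u] | neighbors[v]
    return len(common) / len(union) if union else 0.0

# Toy graph as adjacency sets.
neighbors = {
    "u": {"a", "b", "c"},
    "v": {"b", "c", "d"},
}
# common = {b, c}, union = {a, b, c, d}, so jaccard(neighbors, "u", "v") = 0.5
```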
An alternative approach is to create a rank, which represents the order of linking probability derived from a feature, and aggregate the ranks in line with social choice theory. The predictive power of different features can vary a lot, so it’s preferable to have a voter-weighted version of the Borda count or Kemeny rule. Such voting rules have been found to more accurately predict future connections in the DBLP computer science bibliography than classical solutions such as k-NN, Bayesian methods and decision trees.19 According to other studies, the approximated Kemeny rule that aggregates features like degree centrality or PageRank also outperforms standard supervised learning techniques when it is used to detect outbreaks of viral influence on Twitter.20
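A voter-weighted Borda count of the kind these studies describe can be sketched as follows; the rankings and weights, which stand in for features and their estimated predictive power, are invented:

```python
def weighted_borda(rankings, weights):
    """Voter-weighted Borda: each feature's ranking votes with weight w, so
    a candidate at position p in an n-item ranking earns w * (n - 1 - p).
    Returns the candidates sorted by descending weighted score."""
    n = len(rankings[0])
    scores = {c: 0.0 for c in rankings[0]}
    for ranking, w in zip(rankings, weights):
        for pos, c in enumerate(ranking):
            scores[c] += w * (n - 1 - pos)
    return sorted(scores, key=scores.get, reverse=True)

# Two feature-derived rankings of candidate node pairs; the first feature
# is judged twice as predictive as the second.
rankings = [["x", "y", "z"], ["y", "x", "z"]]
# Scores: x = 2*2 + 1*1 = 5, y = 2*1 + 1*2 = 4, z = 0.
```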
Rank aggregation also has proved useful in computational biology, where a recent area of research has focused on deciphering so-called microRNA targets. Andrew Fire and Craig Mello received the 2006 Nobel Prize for medicine for the discovery of RNA interference.21 This process silences gene expression: Small noncoding RNAs, called microRNAs, play a crucial role by blocking messenger RNA from conveying genetic information from the genome to the ribosome; microRNA appears to be part of an immune response. In drug development, the long-term aim is to cure diseases such as cancer and HIV by shutting down genes. Experimental verification of microRNA targets is not an easy task, as each microRNA may affect a large number of genes. Many different target prediction algorithms have been published, and they disagree greatly on the target gene rankings. A reformulated Kemeny algorithm that tries to distribute the total number of disagreements equally across the predictors has been effective in aggregating target predictions.22
Finally, social choice theory arises naturally in the context of markets. Betting markets such as horse racing pay off successful wagers on the correct order of finish. Gamblers seeking an edge can combine ranked elementary predictors, such as a horse’s track record, with voting functions. This method works best when the predicted order carries most of the information, because differences among the horses cannot be quantified without a lot of noise. In that case, the information lost by using ordinal rankings is recouped through the robustness of the ranked elementary predictors and the superior aggregation method. Voting rules flourish especially when magnitude information is not available. Financial analyst recommendations, for example, are available from services such as the Institutional Brokers’ Estimate System (IBES), which archives recommendations on more than 40,000 companies by hundreds of analysts — ordinal data that make social choice theory a useful tool for developing investment signals.
Given their success in a wide range of areas, why haven’t these preference aggregation methods been adopted more broadly in political elections? After all, plurality rule has clear drawbacks. The spoiler effect presents a scenario in which a less popular candidate draws votes from a major candidate with similar positions, allowing smaller parties to change the outcome. The system is also susceptible to gerrymandering, the practice of gaining political advantage by manipulating the shape of congressional districts. As Gibbard points out, strategic manipulation in most voting systems is unavoidable; however, plurality voting is especially susceptible to voters’ reporting insincere preferences to gain a better outcome. And according to Duverger’s law, attributed to French sociologist Maurice Duverger, the first-past-the-post election rule favors the emergence of a two-party system.
To combat some of these disadvantages, a number of countries have implemented instant runoff voting: Australia uses such a system in general elections, Ireland in presidential elections, the U.K. and Canada to elect major party leaders and the state of Maine to elect U.S. senators and representatives. However, voters are often resistant to changing age-old practices, particularly when the substitute is more complex. In 2011, the U.K. held the alternative vote referendum, which proposed replacing its plurality system with IRV in most elections. Two-thirds of voters rejected it, with a majority backing the change in only ten of 440 local voting areas: Cambridge, Oxford and eight others. The Kemeny rule, for its part, has seen no practical use in elections, largely because of its computational cost.
In fact, we have yet to see a society that follows the metasearch engines and applies local Kemenization to IRV results to ensure that elections are Condorcet-consistent (with extremist parties falling in rank like spam in a search engine).
All voting systems have flaws that produce practical problems, yet democracy remains the most accepted way to govern individuals around the globe. As Winston Churchill famously said, “Indeed, it has been said that democracy is the worst form of government except all those other forms that have been tried from time to time.”
Still, a healthy democracy requires credible voting systems that are viewed as fair and reflect as much information as possible about individuals’ preferences. As a starting point, asking voters to submit an ordered list of preferences for candidates may be a big step toward an optimal solution. Majority rule will continue to produce unexpected results, even outcomes not backed by a majority, but attempts to improve voting systems increase year by year.
The results of both the 2016 and the 2020 U.S. presidential elections highlight the flaws of plurality voting and reveal a profound need for a change in the system. According to Atlantic writer Russell Berman, the 2016 “election pitted the two most disliked candidates in the history of public polling against each other. In the race between Donald Trump and Hillary Clinton, millions of Americans found themselves forced to vote for a major-party nominee they plainly couldn’t stand or to risk electing the candidate they hated even more by casting their ballot for a third-party contender.”23 Although Berman backed up his argument with public opinion data on candidates since 1980, the outcome of the 2016 race cannot be attributed solely to that. As we’ve seen, plurality voting can have such an effect when individual voters’ preferences over all the candidates are disregarded — a situation that grows more severe as the number of candidates grows.
But there’s good news. Election systems are improving slowly but steadily, as the FairVote tally suggests, which may lead to better social choices that reflect preferences more accurately. Fears of disenfranchisement or voter suppression through relatively complex voting methods remain an issue, and as a study of the 2018 Maine elections suggests, ranked-choice voting does place an administrative burden on county clerks — particularly GOP clerks — who oversee elections. (Partisanship even extends to county clerks.) But practice, among both voters and clerks, may smooth some of the difficulties. And, of course, the preference-discovery mechanism of elections remains a key to democracy. As Abraham Lincoln said in 1856, in a quote that may be just as relevant today as it was then, “Do not mistake that the ballot is stronger than the bullet.”
Marton Farkas is a Senior Regional Research Director at WorldQuant and has a master’s degree in mathematics from the University of Cambridge.
Dusan Timotity is a Vice President in the Value Addition Group at WorldQuant and has a PhD in financial modeling from Budapest University of Technology and Economics.