The dangers of analysis paralysis in voting reform

Lime

@sarawolk said in The dangers of analysis paralysis in voting reform:

@lime Among people who know about them or know about voting methods already, or among people who have never heard of them and get a simple explanation and form an intuitive opinion?

It looks like there's been a bunch of research on the topic, and the consensus seems to be that people prefer slightly more categories—up to 10, with more than 10 having smaller effects. When given the option to rate candidates on a 100-point scale, voters almost-always choose multiples of 10 (with no peaks at multiples of 20); asked to explicitly state which of several scales they liked best, they usually go with 10-point scales; and in psychometrics, the validity of self-ratings usually increases up to 9-11 categories.
https://www.rangevoting.org/RateScaleResearch.html

On the other hand, having to fit ten stars on a ballot would be a pain, so I think a +0.5 option is a better presentation. But it looks like 5 is uncomfortably few for most people.

SaraWolk

Yes, & so the obvious solution is to propose the absolute MINIMAL multicandidate voting-system.

My point was NOT that we should only promote STAR or that we should eliminate Approval for consideration. I think they are both great options that are not redundant at all.

The point was that having TOO MANY options is harmful to the adoption of any of them. How many is too many? I would (jokes aside) argue for the elimination of all that are redundant and don't add to the conversation. One traditional ballot, one ranked, and one scored system is plenty. We can eliminate any that have serious issues with vote-splitting and accuracy, that have serious issues with voided ballots or voter error, that have problematic strategic incentives, or that have unnecessary complexity that doesn't add much to the question.

To me that leaves us with STAR, Approval, and Ranked Robin (aka Condorcet), and the multi-winner and PR versions of those.

SaraWolk

Approval is another great system, but the fact is that voters need to approve the front-runner on their side (and everyone they like better than them). That means that a voter who doesn't like the frontrunners has to lie and approve one anyways to have an effective voice.

No, you don’t have to approve anyone that you don’t like. Never approve someone that you don’t like.

Okay. You don't HAVE to, but you should. If the options are Trump, Bernie, and Biden, I don't want to approve Biden and I also don't want to bullet vote Bernie. Approval in that election doesn't get the job done. I need to be able to show that I prefer my favorite, even if I think he's an underdog. I need to be able to give my lesser evil 1 star to prevent vote-splitting without strategically supporting a candidate I dislike.

This doesn't mean that I don't support Approval, I do approve it, but this does mean that I'm not inspired to support it to the exclusion of all others. It also means I worry it doesn't have what it takes to win over reformers in sufficient numbers with sufficient enthusiasm.

The pitch for Approval, to someone excited about any other alternative voting method, is that even though they think they want to be able to show their preferences, their preferences are actually irrelevant because Approval doesn't need them to find a good enough centrist consensus winner. That's not a compelling response and it comes across as tone deaf and dismissive. Approval has been proposed over and over in Oregon. The fact that it's not the proposal isn't for lack of being included in the options.

just propose the method that's completely un-arbitrary because it's absolutely minimal: Approval.

Where to put your Approval threshold is absolutely arbitrary unless you're a sophisticated strategic voter who knows that you should approve the frontrunner on your side and everyone you like better than them. In any case the threshold moves dramatically from election to election and from race to race. That's not actually simple for voters.
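A minimal sketch of the threshold rule described above (approve the frontrunner on your side and everyone you like better), to show how the cutoff moves with the field. The function name `strategic_approvals`, the utility numbers, and the choice of frontrunners are all hypothetical and only for illustration.

```python
def strategic_approvals(utilities, frontrunners):
    """Approve the frontrunner you like best, plus everyone you like better."""
    # The "frontrunner on your side" is the frontrunner you rate highest.
    best_front = max(frontrunners, key=lambda c: utilities[c])
    threshold = utilities[best_front]
    return {c for c, u in utilities.items() if c == best_front or u > threshold}

# Hypothetical utilities for one voter (assumed numbers, not from any poll).
voter = {"Bernie": 10, "Biden": 4, "Trump": 0}

# Same voter, same opinions, two different sets of perceived frontrunners.
race_1 = strategic_approvals(voter, frontrunners=["Biden", "Trump"])
race_2 = strategic_approvals(voter, frontrunners=["Bernie", "Trump"])
```

In the first race the voter approves both Bernie and Biden; in the second, with Bernie perceived as a frontrunner, the same voter bullet-votes Bernie. The voter's opinions never changed, only the polling picture did.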

remind them that...

Please join your local League of Women Voters chapters and remind them yourself. It's not for women only and they need more people who know the facts on voting reform. This was my point. Don't be an armchair election theorist. It's a battle out there and we need you all, even if we don't agree on top choice method.

You don’t know whether you should approve your 2nd-choice? Neither do the other voters, so don’t worry about it.

Great! Thanks. I'll take your word for it. It's just our elections.

Toby Pereira

@sarawolk said in The dangers of analysis paralysis in voting reform:

To me that leaves us with STAR, Approval, and Ranked Robin (aka Condorcet), and the multi-winner and PR versions of those.

I'm not sure that these are necessarily the only options. I've said before that I don't agree that Ranked Robin is the best Condorcet method. I think Equal Vote chose it because of its simplicity. However, its simplicity as a base method simply shifts any complexity into the tie-break.

Electing the candidate with the most head-to-head wins when there is not a Condorcet winner will likely more often than not do nothing. There will probably be a three-way cycle, leaving three candidates with the same number of head-to-head wins. So it just moves to the next tie-break, which I believe is a Borda count among the remaining winners. So at this point of complexity, especially when you also consider that it's not cloneproof, I don't see that it really has any advantages over Ranked Pairs, which is cloneproof.
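A minimal sketch of that situation, assuming a hypothetical electorate with a rock-paper-scissors cycle among A, B and C plus a weak fourth candidate D. The ballot counts and the helper `copeland_wins` are made up for illustration; the sketch only counts head-to-head wins and does not implement any tie-break.

```python
from itertools import combinations

def copeland_wins(ballots, candidates):
    """Count head-to-head wins per candidate.

    ballots: list of (ranking, count), ranking lists candidates best-first.
    """
    wins = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        a_over_b = sum(n for r, n in ballots if r.index(a) < r.index(b))
        b_over_a = sum(n for r, n in ballots if r.index(b) < r.index(a))
        if a_over_b > b_over_a:
            wins[a] += 1
        elif b_over_a > a_over_b:
            wins[b] += 1
    return wins

# Hypothetical ballots: A beats B, B beats C, C beats A, and all three beat D.
ballots = [
    (["A", "B", "C", "D"], 35),
    (["B", "C", "A", "D"], 33),
    (["C", "A", "B", "D"], 32),
]
wins = copeland_wins(ballots, ["A", "B", "C", "D"])
```

Here A, B and C each end up with two head-to-head wins, so the "most wins" step eliminates only D and the decision falls entirely to the next tie-break.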

Also for a score-based method, I'm still not convinced that STAR is the method. I said on the Election Methods list the other day that while basically all methods fail Independence of Irrelevant Alternatives (IIA), STAR seems to do so in a more wilful way. I'll just quote myself:

However, with STAR, say candidate A scores highest followed by B. B then beats A head-to-head and wins the election. But let's say that C enters the race and all the other candidates' scores remain the same. C's total score is between A and B. A then beats C in the head-to-head and wins the election. Also we can imagine that B beats both A and C head-to-head and is the Condorcet winner. In this case, STAR has decided that B need not be compared to A head-to-head because another candidate has an intermediate score. But nothing has materially changed between A and B. This is a failure of IIA caused by a decision to make it happen.
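The scenario above can be reproduced with a small sketch. The ballots below are hypothetical numbers chosen to match the description (A scores highest, C lands between A and B, and B beats both head-to-head); `star_winner` is a simplified tally that ignores ties and is not an official implementation.

```python
def star_winner(ballots, candidates):
    """Simplified STAR: top two by total score, then a head-to-head runoff.

    ballots: list of (scores dict, count). Score ties are not handled.
    """
    totals = {c: sum(s[c] * n for s, n in ballots) for c in candidates}
    first, second = sorted(candidates, key=lambda c: totals[c], reverse=True)[:2]
    prefer_first = sum(n for s, n in ballots if s[first] > s[second])
    prefer_second = sum(n for s, n in ballots if s[second] > s[first])
    return first if prefer_first >= prefer_second else second

# Hypothetical electorate: totals are A=32, C=22, B=18.
ballots = [
    ({"A": 5, "B": 0, "C": 4}, 4),  # strongly prefer A, see C as close to A
    ({"A": 2, "B": 3, "C": 1}, 6),  # mildly prefer B over A
]

without_c = star_winner(ballots, ["A", "B"])       # runoff is A vs B
with_c = star_winner(ballots, ["A", "B", "C"])     # runoff is A vs C
```

Without C, B wins the runoff 6 to 4. With C on the ballot and every A and B score unchanged, C takes the second runoff slot and A wins, even though B still beats both A and C head-to-head.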

Also I've never been comfortable with STAR failing independence of clones, as I see it as a fairly "cheap" criterion to pass. From the same post:

Also on STAR's clone failure - I think Chris Benham previously talked about having an approval cut-off and having the run-off between the most approved candidate and the candidate approved on most ballots that don't approve the most approved candidate (he called it approval opposition). You could also do something similar with the scores. The run-off would be between the highest scoring candidate and the candidate with the greatest "score excess" over that candidate. To measure candidate A's score excess over candidate B, you add up the differences in score between A and B on all the ballots where A outscores B. This is arguably a simple enough change to STAR to make it cloneproof.
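One possible reading of the score-excess rule, as a sketch only. The helper names `score_excess` and `runoff_pair` and the ballots are hypothetical, ties are not handled, and this is not claimed to be a worked-out or tested proposal.

```python
def score_excess(ballots, a, b):
    """Sum of (a's score - b's score) over ballots where a outscores b."""
    return sum((s[a] - s[b]) * n for s, n in ballots if s[a] > s[b])

def runoff_pair(ballots, candidates):
    """Top scorer plus the candidate with the greatest score excess over them."""
    totals = {c: sum(s[c] * n for s, n in ballots) for c in candidates}
    top = max(candidates, key=totals.get)
    challenger = max(
        (c for c in candidates if c != top),
        key=lambda c: score_excess(ballots, c, top),
    )
    return top, challenger

# Hypothetical electorate where C is a near-clone of A (totals A=32, C=22, B=18).
ballots = [
    ({"A": 5, "B": 0, "C": 4}, 4),
    ({"A": 2, "B": 3, "C": 1}, 6),
]
pair = runoff_pair(ballots, ["A", "B", "C"])
```

Under plain STAR these ballots would send A and C to the run-off. With the score-excess rule, C never outscores A on any ballot, so its excess is zero and B is the challenger instead.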

I do understand resistance to any changes that would make STAR more complex though obviously.

Finally about what you said about approval - I like approval partly because of its simplicity and the lack of any advantages FPTP has over it (simplicity and participation are FPTP's only "weapons" as far as I can see). But I do also agree it has problems when it comes to people not wanting to endorse a particular candidate. As I posted here:

Arguably one problem with approval voting is that people might refuse to approve the "lesser evil" of the two main candidates because they see it as a vote for and an endorsement of them. Whereas with a ranked ballot, they simply rank the candidates in order and the notion of endorsement need not come into it. Under FPTP, people will often say they could not vote for x, even if it's between x and y, and they prefer x to y. This is likely to carry over into approval voting.

Edit - and to finish, I'll requote your previous comment:

To me that leaves us with STAR, Approval, and Ranked Robin (aka Condorcet), and the multi-winner and PR versions of those.

There was a thread recently pointing out that proportional STAR isn't really STAR. Also, having reflected on the Allocated Score method that was selected by the committee, I no longer consider it to be the best option. I was involved in some of the discussions as you know and agreed to it at the time, but I've considered the matter further and I think there are much better options. I remember at the time that it wasn't to be set in stone for eternity, but I'm not sure what the criteria for review are either.

Lime

@toby-pereira said in The dangers of analysis paralysis in voting reform:

Also for a score-based method, I'm still not convinced that STAR is the method. I said on the Election Methods list the other day that while basically all methods fail Independence of Irrelevant Alternatives (IIA), STAR seems to do so in a more wilful way. I'll just quote myself:

That's kind of interesting, because I took you as saying the opposite (which is also my understanding of STAR): that STAR doesn't have to fail IIA (or clone-independence), but intentionally chooses to do so because this leads to a slightly better outcome. With STAR, the optimal strategy is for every party to run 2 candidates, which gives every voter at least two choices they can feel comfortable with.

As an example, I'd much prefer a situation where both Biden and Kamala Harris were listed separately on the ballot so I could rank Harris higher (and help her win the runoff). Right now, I'm not happy with any of the candidates in the race; on a simple left-right scale I'm close to Biden, but I disapprove of him for reasons of competence. (But I'm sure as hell not supportive of any other candidate...) With STAR, every voter should have at least two choices they consider tolerable.

Personally, I think of STAR as just reversing the primary-then-general order: we have a general election to choose the best party (the score round), and then a "primary" where we pick the best nominee by majority vote.

Lime

@toby-pereira said in The dangers of analysis paralysis in voting reform:

I was involved in some of the discussions as you know and agreed to it at the time, but I've considered the matter further and I think there are much better options. I remember at the time that it wasn't to be set in stone for eternity, but I'm not sure what the criteria for review are either.

I'll always support any kind of highest-averages system over quota-allocation. (Although as mentioned in another thread, they don't conflict if we drop the fixed-size assumption; Congress used this trick to apportion seats from 1850–1910.)

Any system that violates participation without being forced at gunpoint (by a four-way Condorcet cycle) is probably unconstitutional, since it strips some people of their voting rights (making their ballots less than worthless).

SaraWolk

@lime said in The dangers of analysis paralysis in voting reform:

the consensus seems to be that people prefer slightly more categories—up to 10, with more than 10 having smaller effects.

https://www.laguardia.edu/uploadedfiles/main_site/content/ir/docs/the-qualtrics-handbook-of-question-design.pdf

For voting, having high-quality, more reliable data is really important, so it's important to have the range on the lower end of the workable cognitive load range. 10 is too many. Ballot design for paper ballots also makes a field of too many bubbles a non-starter, like you say. Third, we have to remember that a voter's available cognitive load should not all be used up on the rating itself. The voter needs some bandwidth available to consider the actual candidates as well.

"Determining the number of scale points is a balancing act, which creates a tension when trying to maximize data quality. Including more scale points might differentiate responses more, whereas fewer scale points might produce more reliability. Fortunately, survey methodology research on this subject provides some guidelines for best practices that enable optimal validity and reliability. The results of this research suggest that the optimal number of scale points ranges from 5 to 9—with fewer points, you lose the ability to differentiate as much as you could between respondents, and with more scale points, the reliability of responses tends to drop off."

As we see in this thread, some people are saying that STAR is too much and that they prefer Approval for that reason. Others are saying that voters actually prefer 0-9 (citation needed). It makes a lot of sense to offer people something in the middle so we can maximize the best of both worlds.

Toby Pereira

@lime said in The dangers of analysis paralysis in voting reform:

@toby-pereira said in The dangers of analysis paralysis in voting reform:

Also for a score-based method, I'm still not convinced that STAR is the method. I said on the Election Methods list the other day that while basically all methods fail Independence of Irrelevant Alternatives (IIA), STAR seems to do so in a more wilful way. I'll just quote myself:

That's kind of interesting, because I took you as saying the opposite (which is also my understanding of STAR): that STAR doesn't have to fail IIA (or clone-independence), but intentionally chooses to do so because this leads to a slightly better outcome. With STAR, the optimal strategy is for every party to run 2 candidates, which gives every voter at least two choices they can feel comfortable with.

As an example, I'd much prefer a situation where both Biden and Kamala Harris were listed separately on the ballot so I could rank Harris higher (and help her win the runoff). Right now, I'm not happy with any of the candidates in the race; on a simple left-right scale I'm close to Biden, but I disapprove of him for reasons of competence. (But I'm sure as hell not supportive of any other candidate...) With STAR, every voter should have at least two choices they consider tolerable.

Personally, I think of STAR as just reversing the primary-then-general order: we have a general election to choose the best party (the score round), and then a "primary" where we pick the best nominee by majority vote.

I see, so you see this as a feature of STAR, not a bug? Obviously cloneproof methods mean that parties can run two candidates without them harming each other, but STAR actively encourages it, which you argue is a good thing. I hadn't actually thought about it that way, but I see your point. Essentially the run-off is just to decide within the party (reversing things as you say). I'm not sure it was the original intention of STAR, but it's worth discussing certainly.

On the other hand, people might get annoyed if there is a candidate from another party who would have won head-to-head against the two in the run-off, but they just lost out in the scores, and therefore got cloned out of the run-off.

Also, with a better voting method (STAR or something else), elections shouldn't need to be party-dominated all the time. If there is an independent candidate (or candidate from a smaller party), they may not have someone to run alongside them, so they could be disadvantaged by this method (by not being able to block out the run-off if they are the most popular candidate).

Toby Pereira

@lime said in The dangers of analysis paralysis in voting reform:

@toby-pereira said in The dangers of analysis paralysis in voting reform:

I was involved in some of the discussions as you know and agreed to it at the time, but I've considered the matter further and I think there are much better options. I remember at the time that it wasn't to be set in stone for eternity, but I'm not sure what the criteria for review are either.

I'll always support any kind of highest-averages system over quota-allocation. (Although as mentioned in another thread, they don't conflict if we drop the fixed-size assumption; Congress used this trick to apportion seats from 1850–1910.)

Any system that violates participation without being forced at gunpoint (by a four-way Condorcet cycle) is probably unconstitutional, since it strips some people of their voting rights (making their ballots less than worthless).

It's quite hard not to violate participation. If you use a highest averages party-list system then it's easy, but it becomes harder with candidate-based systems, especially where you elect sequentially rather than all-at-once for computational reasons. Non-deterministic methods might solve that, but would they be constitutional?

Different countries obviously have different constitutions, but I presume you're talking mainly about the US constitution.

Toby Pereira

@sarawolk said in The dangers of analysis paralysis in voting reform:

@lime said in The dangers of analysis paralysis in voting reform:

the consensus seems to be that people prefer slightly more categories—up to 10, with more than 10 having smaller effects.

https://www.laguardia.edu/uploadedfiles/main_site/content/ir/docs/the-qualtrics-handbook-of-question-design.pdf

For voting, having high-quality, more reliable data is really important, so it's important to have the range on the lower end of the workable cognitive load range. 10 is too many. Ballot design for paper ballots also makes a field of too many bubbles a non-starter, like you say. Third, we have to remember that a voter's available cognitive load should not all be used up on the rating itself. The voter needs some bandwidth available to consider the actual candidates as well.

I'm not convinced that a voter has a set amount of bandwidth that they have to share out between considering the candidates and the scores. Also if that paper says the optimum number of scores is 5 to 9, that presumably includes considering the thing and scoring it. And people will generally vote with some idea of what they are going to do. It's not the same as abstract surveys where the questions might be completely unknown to them. So I'd say bumping it up to 10 choices (so 0-9) is not completely unreasonable.

As we see in this thread, some people are saying that STAR is too much and that they prefer Approval for that reason. Others are saying that voters actually prefer 0-9 (citation needed). It makes a lot of sense to offer people something in the middle so we can maximize the best of both worlds.

I see the point, but approval has advantages for specifically being a binary thing rather than for just not having many choices. I don't think the graph of goodness has to necessarily go up and down smoothly with number of choices. E.g. I would prefer all of approval, 0-5 and 0-9 over 0-2, 0-3 or 0-4.

SaraWolk

@toby-pereira said in The dangers of analysis paralysis in voting reform:

Ranked Robin

We are planning to come back to the original intention around Ranked Robin, which is to stop branding Condorcet as a whole bunch of systems to fight between, and move to calling them one system, Ranked Robin, with a variety of "tie breaking protocols" a jurisdiction's special committee on niche election protocols could choose between. Honestly, specifying Copeland vs RP vs Minimax is way beyond the level of detail that should even be written into the election code or put to the voters.

Equal Vote's point with the Ranked Robin was never to say that Copeland is better than Ranked Pairs is better than Smith/Minimax. The point is that these are all equivalent in the vast, vast majority of scaled elections and that Condorcet as a whole is top shelf so it should be presented to voters as a better ranked ballot option. Ranked voting advocates should support it. The main reason Condorcet is not seriously considered is because of analysis paralysis and a total lack of interest in branding and marketing for simplicity and accessibility.

Lime

It's quite hard not to violate participation. If you use a highest averages party-list system then it's easy, but it becomes harder with candidate-based systems, especially where you elect sequentially rather than all-at-once for computational reasons. Non-deterministic methods might solve that, but would they be constitutional?

Different countries obviously have different constitutions, but I presume you're talking mainly about the US constitution.

The US constitution under my own (admittedly not a legal scholar, and this argument has never been tested) interpretation of past Supreme Court rulings. I'll note that the BVerfG has made the same ruling with regards to the German constitution before (nonparticipation violates the equal suffrage guarantee), so I don't think it's crazy, but it's absolutely not a standard interpretation.

I think it's sensible to accept participation failures "at gunpoint": basically, in situations where it's completely unavoidable, unless you violate some other similarly-important criterion (like Condorcet). STAR does something like this (there's a somewhat convoluted situation where casting a ballot causes the second-place candidate to be replaced by someone stronger, but this is justifiable because the new winner is arguably more popular than the old one; after all, they had a higher score).

My issue with quota methods is they violate participation more often than necessary: there are situations where there's a perfectly good participation-friendly solution, but quota methods fail to find it. (For example, in the simple party-list case.)

SaraWolk

clone-independence

There are many reasons why running clones is strongly disincentivized in general, for every voting method, regardless of passing IIA or not. Voter behavior, competing for volunteers, endorsements, and funding, etc. The statement that the best strategy in STAR is to run 2 clones per faction is absolutely wildly false in the real world, even if it might make sense in a computer model.

Lime

@sarawolk said in The dangers of analysis paralysis in voting reform:

@toby-pereira said in The dangers of analysis paralysis in voting reform:

Ranked Robin

We are planning to come back to the original intention around Ranked Robin, which is to stop branding Condorcet as a whole bunch of systems to fight between, and move to calling them one system, Ranked Robin, with a variety of "tie breaking protocols" a jurisdiction's special committee on niche election protocols could choose between. Honestly, specifying Copeland vs RP vs Minimax is way beyond the level of detail that should even be written into the election code or put to the voters.

Equal Vote's point with the Ranked Robin was never to say that Copeland is better than Ranked Pairs is better than Smith/Minimax. The point is that these are all equivalent in the vast, vast majority of scaled elections and that Condorcet as a whole is top shelf so it should be presented to voters as a better ranked ballot option. Ranked voting advocates should support it. The main reason Condorcet is not seriously considered is because of analysis paralysis and a total lack of interest in branding and marketing for simplicity and accessibility.

So then "Ranked Robin" is just supposed to refer to Condorcet methods in general?

I think that's a good strategy, but the presentation on the website made me think that Ranked Robin means Copeland//Borda specifically.

SaraWolk

@lime Yeah. We're working on that edit.

Lime

@sarawolk said in The dangers of analysis paralysis in voting reform:

There are many reasons why running clones is strongly disincentivized in general, for every voting method, regardless of passing IIA or not. Voter behavior, competing for volunteers, endorsements, and funding, etc. The statement that the best strategy in STAR is to run 2 clones per faction is absolutely wildly false in the real world, even if it might make sense in a computer model.

That's interesting. I don't think we could know without empirical evidence, but my assumption was those resources would be shared and candidates would campaign jointly with a "running mate" of sorts. If not, that makes me extremely nervous about STAR's criteria failures, which up until now I'd been assuming were a bigger problem on paper than in real life. If every candidate has a near-clone (which I assumed was the goal), STAR behaves almost the same as score. (The only difference being voters can give up a slight amount of influence over the score round to help their favorite win the runoff.)

But if we don't have clones, we could end up with a turkey-raising problem on our hands. A Gore voter might cast a strategic vote like—
Gore: 5
Bush: 0
Hitler: 4

Hoping that Hitler is polarizing enough to defeat Bush for second place with Gore's support (at which point he's a weak candidate in the runoff). But if Bush's faction thinks the same thing, Hitler can end up winning.

Candidates running in pairs makes this pointless, since you can lock up both spots in the runoff. But if that's not guaranteed, I'd be extremely concerned about STAR.
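A sketch of how that could play out, using hypothetical faction sizes and scores chosen to illustrate the scenario. It says nothing about how likely this is in practice. The third candidate is labelled "Turkey", `star_winner` is a simplified tally that ignores ties, and `electorate` is a made-up helper.

```python
def star_winner(ballots):
    """Simplified STAR: top two by total score, then a head-to-head runoff."""
    candidates = list(ballots[0][0])
    totals = {c: sum(s[c] * n for s, n in ballots) for c in candidates}
    first, second = sorted(candidates, key=lambda c: totals[c], reverse=True)[:2]
    prefer_first = sum(n for s, n in ballots if s[first] > s[second])
    prefer_second = sum(n for s, n in ballots if s[second] > s[first])
    return first if prefer_first >= prefer_second else second

def electorate(gore_buries, bush_buries):
    """Build ballots; a faction that 'buries' gives the turkey 4 and its rival 0."""
    gore = ({"Gore": 5, "Bush": 0, "Turkey": 4} if gore_buries
            else {"Gore": 5, "Bush": 1, "Turkey": 0})
    bush = ({"Gore": 0, "Bush": 5, "Turkey": 4} if bush_buries
            else {"Gore": 1, "Bush": 5, "Turkey": 0})
    fringe = {"Gore": 0, "Bush": 1, "Turkey": 5}
    # Assumed faction sizes: 45 Gore voters, 43 Bush voters, 12 Turkey voters.
    return [(gore, 45), (bush, 43), (fringe, 12)]

# Winner for each combination of (Gore faction buries, Bush faction buries).
outcomes = {
    (g, b): star_winner(electorate(g, b))
    for g in (False, True)
    for b in (False, True)
}
```

With these numbers, honest voting elects Bush, burial by Gore's faction alone elects Gore, and burial by both factions elects the turkey.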

SaraWolk

proportional STAR isn't really STAR

True. It's Proportional Score. This is definitely not set in stone as the best or only way to do Proportional Score, and I expect to see more progress toward determining what the "best" PR method that includes a 5 star ballot and a binary "runoff" of some type would be. Like Condorcet, I expect that there will ultimately be a number of viable proposals that could be considered best depending on what considerations one finds most important.

SaraWolk

[Image: Honesty is the Best Policy Peer Review Chart]

A Gore voter might cast a strategic vote...

The example given isn't a "strategic" vote in any way. That would be an extremely risky vote that would be as likely to elect Hitler as it would be to help ensure your favorite won the runoff. By definition if the turkey candidate is strong enough to make the runoff then it's strong enough to be a real threat to your favorite.

Our paper found that burial is strongly disincentivized in STAR.

Constitutional Political Economy. STAR Voting, Equality of Voice, and Voter Satisfaction: Considerations for Voting Reform
https://rdcu.be/dkoyx

Lime

@sarawolk said in The dangers of analysis paralysis in voting reform:

Our paper found that burial is strongly disincentivized in STAR.

Yes, that's exactly the problem. We're talking about the same issue from two different perspectives. The issue is that burial is so strongly disincentivized that it's catastrophic. (Much like how the death penalty for littering would be very good at disincentivizing littering, but very bad for society.)

STAR punishes burial by blowing up the country, which creates a game of chicken. The mixed Nash equilibrium of chicken involves blowing up the country with some small (but positive) probability.
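To make the chicken claim concrete, here is a minimal sketch of the symmetric mixed equilibrium. The payoff numbers are pure assumptions, "dare" stands in for burying and "swerve" for voting honestly, and a real election need not be symmetric like this.

```python
from fractions import Fraction

def chicken_mixed_equilibrium(win, tie, lose, crash):
    """Probability that each player dares in the symmetric mixed equilibrium.

    Requires win > tie > lose > crash, the defining payoff order of chicken.
    The opponent dares with probability p that leaves you indifferent:
        (1 - p) * tie + p * lose == (1 - p) * win + p * crash
    which solves to p = (win - tie) / ((win - tie) + (lose - crash)).
    """
    assert win > tie > lose > crash
    return Fraction(win - tie, (win - tie) + (lose - crash))

# Assumed payoffs: a modest gain from winning, a very large loss if both dare.
p_bury = chicken_mixed_equilibrium(win=1, tie=0, lose=-1, crash=-10)

# Both sides bury at once with probability p squared.
p_disaster = p_bury ** 2
```

With these payoffs each side buries 10% of the time and both bury at once 1% of the time. Making the crash worse shrinks that probability but never brings it to zero.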

The example given isn't a "strategic" vote in any way. That would be an extremely risky vote that would be as likely to elect Hitler as it would be to help ensure your favorite won the runoff. By definition if the turkey candidate is strong enough to make the runoff then it's strong enough to be a real threat to your favorite.

Risky? Yes. But it's still plausibly strategic, if you think Bush will back down.

This is especially bad since it's the kind of strategy I think candidates and campaigns will try to encourage (regardless of how bad the outcomes are). Candidates coordinate strategy; voters take cues from campaigns and political elites (which is why the two major-party nominees are always the top-2 winners). If voters were individually strategic and self-interested, the low probability of a tie means nobody would vote.

The strategy I showed above would probably be bad for society or even for individual voters, because it has a good shot at backfiring and electing Hitler. However, it can be good for Gore's probability of winning, if Gore thinks Bush will back down.

Empirically, this happens all the time. Adam Schiff spent millions trying to boost the Republican in California over Katie Porter. The DNC keeps intervening in Republican primaries to try and get them to nominate extremists. They keep doing this because they think it's good for their own personal chances of winning the election, not because they think it's good for the country overall. And generally, they're right—even though it risks electing Hitlers, it still helps them win seats.