STAR vs. Score

Jack Waugh

I think all the replies to date on this topic are from people who either, like me, have skepticism about STAR, or who outright oppose it on various grounds. The STAR advocates haven't yet spoken up. I want one of them to present an example where the results differ and say that they think STAR produced better voter satisfaction overall than Score would have. I expect I will be able to counter that when they worked their example, they did not use the optimal strategy with Score. So the end product I expect to happen from such exchanges is a lack of cases that count for STAR. If the outcomes aren't any better, there is no justification for the extra complexity.

Jack Waugh

STAR is proposed as an "IRV 2.0." In response I propose an "IRV 3.0," which has more to do with IRV, because it accepts votes in IRV style, which STAR does not.

cfrank

@Keith Yes that is true, although I don't really understand why people would be content with tyranny by the majority when it may be possible to avoid it (unless they are a part of the majority). I'm actually not much of a fan of utilitarianism, I am in favor of something that is distributionally just. STLR voting is interesting.

@Jack-Waugh, in my opinion STAR is way better than IRV. I'm not exactly an advocate of STAR, but I think in general that rank-order voting is probably not going to be a great solution. I think "independent" scores are really the best we can hope for for now, which means that what matters is what we do with those scores and how that interacts with voters' decision-making.

Keith Edmonds

@Jack-Waugh What STAR does is renormalize everybody's vote weight to give them the same impact. This is an attempt to reduce the amount of strategy needed. I do not think that it would outperform somebody who used optimal strategy with Score. The point is that most people do not or cannot use optimal strategy. STAR then puts people who are bad at strategy on a closer level to those who are good at strategy. So I do not think you are wrong in what you say. If all people were fully informed, rational and strategic, then Score would likely be better. However, people are not any of those things in general. I do not think your line of argument will hold up under this consideration.

An example of where STAR produces a better outcome than Score is

40% = A:5 B:0 C:0
31% = A:0 B:5 C:1
29% = A:0 B:1 C:5

Score gives A and STAR gives B. This is an engineered and somewhat extreme example to illustrate the issue. Is 5 infinitely more than 0, or just 5 more? Is 5 weighted as 4 more than 1, or as 5 times as much? There is no universal metric and different people will choose different metrics. STAR normalizes it all away and compares the two most favoured with full weight to each voter.
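To check the tallies in this example, here is a minimal Python sketch. The function names are made up for illustration, the percentages are treated as voter counts, and tie handling is left out.

```python
# Ballot blocs from the example above: (share of voters in %, scores per candidate).
blocs = [
    (40, {"A": 5, "B": 0, "C": 0}),
    (31, {"A": 0, "B": 5, "C": 1}),
    (29, {"A": 0, "B": 1, "C": 5}),
]

def score_totals(blocs):
    """Sum each candidate's scores, weighted by bloc size."""
    totals = {}
    for weight, ballot in blocs:
        for cand, s in ballot.items():
            totals[cand] = totals.get(cand, 0) + weight * s
    return totals

def score_winner(blocs):
    """Score: the highest total wins."""
    totals = score_totals(blocs)
    return max(totals, key=totals.get)

def star_winner(blocs):
    """STAR: top two by total score, then a runoff where each ballot counts once."""
    totals = score_totals(blocs)
    first, second = sorted(totals, key=totals.get, reverse=True)[:2]
    votes_first = sum(w for w, b in blocs if b[first] > b[second])
    votes_second = sum(w for w, b in blocs if b[second] > b[first])
    # Ties are not handled; this sketch assumes a clear runoff result.
    return first if votes_first > votes_second else second

print(score_totals(blocs))  # {'A': 200, 'B': 184, 'C': 176}
print(score_winner(blocs))  # A
print(star_winner(blocs))   # B
```

A and B are the finalists, and in the runoff B is preferred on 60% of ballots against 40% for A.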

STAR is a simplified version of Baldwin's Method. When you think about it that way you see the intent.

Essenzia

@Keith
Given these 3 types of ratings (assuming they are the ratings of the 2 frontrunners, after eliminating all the others):
[0,1] - [2,3] - [4,5]
STLR normalizes them like this:
[0,5] - [3.33,5] - [4,5]
Baldwin normalizes them like this:
[0,5] - [0,5] - [0,5]

For me, STLR uses better normalization but I don't think it's the best.
If a vote like [4,5] remains the same in the clash between the two finalists, the voter will be encouraged from the start to downplay the rating of the worse of the two candidates (i.e., to vote [0,5] from the start).
I prefer this normalization in clash between two finalists:

if you have a couple [0,0] or [5,5] the vote is irrelevant.
if one of the two candidates has a score of 5, the other is put at 0.
if one of the two candidates has a score of 0, the other is put at 5.
if both candidates have intermediate scores, then STLR normalization applies.

For simplicity, I call START the STAR that uses this normalization.
In this way, at the beginning the voter:

first assigns 5 to his most favorite candidates and 0 to the most hated ones.
then he can feel freer in assigning intermediate scores.

Such normalization is proposed indirectly in Tragni's method, although in that context it is used to make comparisons between couples.
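The three pairwise normalizations discussed in this post can be sketched in Python as follows. This is one reading of the rules as stated above, not a reference implementation: the function names are invented, STLR is taken to mean "scale both scores so the higher one becomes 5" (which reproduces the [0,5], [3.33,5], [4,5] figures), and leaving equal pairs unchanged under Baldwin is an assumption.

```python
MAX_SCORE = 5

def stlr(a, b):
    """Scale the pair so the higher score becomes MAX_SCORE, keeping the ratio."""
    top = max(a, b)
    if top == 0:
        return (0.0, 0.0)  # [0,0]: nothing to rescale
    return (a * MAX_SCORE / top, b * MAX_SCORE / top)

def baldwin(a, b):
    """Stretch any strict preference to the full range."""
    if a == b:
        return (a, b)  # assumption: an equal pair expresses no preference
    return (0, MAX_SCORE) if a < b else (MAX_SCORE, 0)

def start(a, b):
    """The 'START' rules proposed above."""
    if a == b and a in (0, MAX_SCORE):
        return (a, b)  # [0,0] or [5,5]: the vote is irrelevant
    if a == MAX_SCORE or b == MAX_SCORE:
        return (MAX_SCORE, 0) if a == MAX_SCORE else (0, MAX_SCORE)
    if a == 0 or b == 0:
        return (0, MAX_SCORE) if a == 0 else (MAX_SCORE, 0)
    return stlr(a, b)  # both scores intermediate

for pair in [(0, 1), (2, 3), (4, 5)]:
    print(pair,
          [round(x, 2) for x in stlr(*pair)],
          list(baldwin(*pair)),
          [round(x, 2) for x in start(*pair)])
```

The difference shows up on [4,5]: STLR leaves it at [4,5], while START pushes it to [0,5], which is what removes the incentive to mark the lesser finalist down in advance.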

SaraWolk

STAR Voting is designed to maximize both utilitarianism and finding majority supported winners where possible. I look at it like a debate between quality and quantity. Both are important. In STAR Voting the scoring round measures quality of support (how much do the voters like the various candidates?). Then the runoff measures quantity, or number of supporters (between the two front-runners, which do you prefer?).

As for why I believe STAR Voting is more fair and representative compared to Score, of course I have to start with the disclaimer that Score is a very good system, and they get the same winner most of the time, but in Score Voting if I vote honestly and don't give any front-runners a top score, then my vote's impact is less than if I had strategically given my lesser-evil a top score.

Another shortcoming with Score that is addressed by STAR is if some voters fail to use the full scale. Strategically speaking, the best strategy in both methods is always to give your favorite 5 stars, but it's to be expected that some voters will give a mediocre score to a mediocre favorite, especially if they are new to the system. Unless you normalize scores, Score voting gives voters a chance at an equally weighted vote, but doesn't actually guarantee it. Hopefully this will be corrected with voter education and good instructions, but with STAR there's the added failsafe that the runoff is binary. Ultimately your vote is just as powerful as everyone else's.

These voters, voters who are currently marginalized in our current system, should have just as powerful a vote as a voter who does support a frontrunner. Guaranteed. STAR Voting does that. If your favorite can't win, you can give your favorite 5 stars, give your lesser evil 1 star, and that will still ensure that if it comes down to it, your fully weighted vote will help prevent your worst case scenario. That's how STAR Voting prevents tyranny of the Majority. If your lesser-evil is actually substantially better than your worst case scenario you can give them a better score.

As far as I know STAR may be the only method where even if none of the candidates you like can win, your vote can still make a difference and help prevent your worst case scenario.

SaraWolk

@Keith Exactly. And the intent is not only to reduce the need for strategic voting, but to actually incentivize honest voting, and to ensure that the system is fair and equal. This is the key to eliminating an "electability" bias, or status quo glass ceiling. There are a lot of reasons why and it's not just about any one of these reasons in isolation. I see it as a very empowering voting method overall.

For voters who don't like the frontrunners, their vote is still as powerful as a voter who does have a strong candidate on their side. The full repercussions of this are hard to quantify, but this is one reason that I think STAR is the most powerful single winner voting method to break two party domination.

Jack Waugh

@cfrank said in STAR vs. Score:

in my opinion STAR is way better than IRV.

No question.

Jack Waugh

@SaraWolk Nevertheless, you have not as yet provided an example, starting with voter desire, and leading to different outcomes between STAR and Score.

Jack Waugh

@SaraWolk said in STAR vs. Score:

Strategically speaking, the best strategy in both methods is always to give your favorite 5 stars

Not according to STAR and the Nader problem.

And I suppose that if STAR can exhibit this kind of behavior, so can cardinal Baldwin. I see STAR as fundamentally, abbreviated cardinal Baldwin. The abbreviation is achieved by combining the rounds of tallying except the last one.

SaraWolk

STAR does not pass FB criterion, so yes, there is a hypothetical scenario possible where giving your favorite less than 5 could be beneficial, but that does not mean that there's a real election scenario where that's actionable or incentivised in real time with a realistic amount of information on voter behavior available.

The mark of an ideal system is to balance competing considerations and incentives to give something that's robust all around. Score in most cases will get the same outcomes, and so I personally don't think that accuracy is the principle to look at to differentiate between them. The biggest difference is in terms of real world advocacy. Score is a dealbreaker because vote weight isn't normalized. We get attacks on STAR regularly that are not true about STAR, but that are about Score.

Strategic voting aside look at this example:
Voter Vicki is a disenfranchised voter who typically doesn't like the frontrunners in her city, which amazingly uses Score voting to elect the mayor. In this race the frontrunners are named Bad and Worse, and there are a few other options as well. She gets her ballot and fills it out honestly like so:
Bad: 1
Worse: 0
Boring: 2
Lame: 3
Obscure: 5
Because Vicki really dislikes both frontrunners, her vote is predictably less powerful than someone who actually does like one of the frontrunners and dislikes the other. Vicki's vote is thus dependably less powerful than other voters and she remains marginalized. In contrast, STAR Voting guarantees Vicki an equal and fully powerful vote for the finalist she prefers.
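A small Python sketch can put a number on this. It assumes, as the example does, that the contest comes down to Bad versus Worse. The helper names are hypothetical, and "pull" here just means how much the ballot moves the margin between the two candidates, as a fraction of one full vote.

```python
MAX_SCORE = 5
vicki = {"Bad": 1, "Worse": 0, "Boring": 2, "Lame": 3, "Obscure": 5}

def score_pull(ballot, x, y):
    """Fraction of a full vote this ballot adds to x's margin over y under Score."""
    return (ballot[x] - ballot[y]) / MAX_SCORE

def runoff_pull(ballot, x, y):
    """STAR runoff: +1 if x is scored above y, -1 if below, 0 if equal."""
    return (ballot[x] > ballot[y]) - (ballot[x] < ballot[y])

print(score_pull(vicki, "Bad", "Worse"))   # 0.2
print(runoff_pull(vicki, "Bad", "Worse"))  # 1
```

Under Score her honest ballot moves the Bad-versus-Worse margin by a fifth of what a 5-versus-0 ballot would. In a STAR runoff between those two it counts as one full vote.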

Sure, some people could argue that since her strength of preference is weaker it's fair that her vote carry less weight, but most would disagree.

PS. Cardinal Baldwin isn't monotonic. The extra rounds and drawn out process make a difference, so they really aren't the same systems. Just similar.

Jack Waugh

@SaraWolk said in STAR vs. Score:

some people could argue that since her strength of preference is weaker it's fair that her vote carry less weight

No, of course not. Everyone deserves the same weight. But in Score, she has to vote Bad 4 or 5.

Multiround tallying systems are confusing. They produce results that belie the expectation that all balanced systems would behave identically. And for me this expectation came from the logic that if two systems behave differently, at least one of them must be cheating some voter out of some of her rightful power, which contradicts the assumption that both systems are balanced.

I guess some of your points are:

the decision between the top two may matter more than the decision between a random two. So STAR makes sure everyone has full strength in that decision even if they vote their desires without regard to any estimate of where the other voters stand.
STAR performs much better than Score when voters vote that way.
STAR makes it difficult to find a better performing strategy, even though theoretically, one exists. The signal-to-noise ratio for finding it is prohibitively low.

Let's add to the candidate field of your example, Bad II, a clone of Bad, and Worse II, a clone of Worse. Of course, all the voters are aware this has happened, so can adjust their strategies. Since the four bad and worse candidates are the front runners, unless there is a significant upset (difference between perception of where the voters stand and where they turn out to actually stand), the finalists will be Bad and Bad II or Worse and Worse II. How should Vicki vote?

SaraWolk

@Jack-Waugh
Bad1: 1 star
Bad2: 1 star
Worse1: 0 stars
Worse2: 0 stars

I disagree that the fact that Score and STAR don't produce identical results means that one or the other is cheating. Neither is cheating voters. They are both good methods and they optimize for slightly different things.

STAR optimizes for both strength of support and number of supporters.
Score optimizes for strength of support specifically.

Both are valid goals and methods, but there's a real world benefit to narrowing down the list of proposals to help lay people make a good choice. If we promote both loudly (and also list all other good methods we can think of), the considerations would overwhelm most people and lead them to quit researching, or worse, to come to a decision after considering only a one-sided set of considerations.

Take Condorcet for example. Condorcet has largely failed to get adopted anywhere because of lack of consensus around the best version, despite that all versions are quite a bit better than most methods in use. If Condorcet advocates had come together around a good well rounded proposal and simplified their pitch a long time ago it would likely be the dominant RCV method, but no, they focused on academic debate over cohesive advocacy. We cardinal advocates should take note. Are we debating because we want better democracy in the real world, or because we find the question interesting and enjoy the debate for its own sake?

Jack Waugh

@SaraWolk, I think that if the conditions here are that 99% of the electorate considers the race as being between the Bad party and the Worse party, and they aren't even taking Obscure into consideration as a possible winner, Vicki must give Bad[a] and Bad[b] scores of at least 4 so as to exert sufficient pressure to do her part toward preventing Worse[a] and Worse[b] from being the finalists. The existence of the clones reduces STAR to Score and so the situation demands the same strategy as would be appropriate for Score.

As to why pose questions and try to answer them, it's because I need to learn what is going on with these systems, to try to prevent being pulled into error again.

I am involved with a little political group that thinks it is drafting platform planks for a national-level party. I joined it for the sole purpose of trying to prevent error in its stance on voting systems. I would be concerned for the rhetorical effect on State-level parties if a national-level party publishes a severely misleading stance in this regard. The first draft of a platform on which this group bases its work (from another group) requires a ranking voting system in all cases. I feel that allowing that stance to stand would be severely misleading, because choosing ranking eliminates rating, and there are grounds to judge that several rating systems are more democratic than even the most democratic ranking systems. And especially more so than IRV, which is in practice what people mean when they call for ranked-choice voting.

I believe that of the people involved in the group, I have by far the most knowledge on single-winner voting systems. I think most of the group either don't care, or think I am the one who has the deepest understanding. Of course, they don't think I am infallible, and I have taken care to present myself as fallible. I said, I am not God and my opinions might not be correct, but, I keep saying, I can present arguments to support them. Interest in the details of these arguments has been slight to nonexistent. But I have been asked questions about what opinions I have, going outside of those I initially stated when approaching the other members of this group on this subject. For example, I have been asked whether I think IRV is better than FPtP.

I have been telling this group that STAR is at least as democratic as Score. I don't want egg all over my face from finding out later that it is false.

I supported IRV for years because it made intuitive sense to think that it gives third parties and independents a chance. After all, it tallies in rounds, and in an early round, you get a chance to support, effectively, your favorite candidate, and if that effort fails, you get a say in the final round as between the bad and the worse. It's very strongly intuitively attractive. It took discussion and argument and deeper study to see that my intuition was simply not correct. Intuition in general is not guaranteed to amount to a correct understanding of the facts in all cases, and neither is "common sense." Sometimes I think common sense is correct 80% of the time, and sometimes I think it is so only 20% of the time. Intuition and common sense are heuristics, mental shortcuts, useful for making emergency decisions when we do not have time for study in depth.

Recognizing that there are reasons for seeking deeper and more nearly rigorous understanding, I nevertheless encounter an effective obstacle in that I am neither practiced nor talented in math. I think for people who are, their intuition more closely matches the reality, much as people who are good at chess can assess a position. If my level of familiarity with math matched that of Turing, and Euler, and Ramanujan, and von Neumann and the uncredited females he stole ideas from, and Curry, and Emmy Noether, I could probably work this out by myself. But I'm not at that level, and so tend to ask for help.

When you or I or any of the readers asserts that a single-winner voting system gives equal power to the voters, one voter to another, the correctness or incorrectness of that assertion turns on the matter of who is selected as the winner by that system.

Suppose groups of us are engaged in a literal tug of war. But rather than a single rope, there is a hub device and several ropes attached to it, leading to the groups of people who are going to pull on it. The hub device and the ropes are free to move over the ground. A circle is drawn in the grass, and the hub device placed at the center of that. Every group picks up their respective rope and pulls on it. The hub device will stay in the center if our forces balance to zero. Otherwise, it will be pulled toward some point on the circle. If our forces, person for person, are equal, surely only one outcome is possible. How can a contest go two different ways without changing the relative power of the participants? This point still confuses me.

At this point, I do not have a complete mathematical definition of voting equality. The closest thing I have to it is a pair of conditions that I argue are necessary. I give provisional credence to the idea that these conditions may also be sufficient, simply for lack, for now, of clear evidence to the contrary.

First condition: Frohnmayer balance. If one voter can move the needle, another voter must be able to move it back.
Second condition: best known freedom of expression. The best known is that shared by Score/STAR/Approval. Counterexamples that still meet the first condition include Borda count (requiring ranking all candidates), "vote for and against", and "vote for or against."

Clearly, Score and STAR meet both of these conditions.
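For the first condition, one common reading is that every ballot has an "opposite" ballot that exactly cancels it. A minimal Python sketch of that check for Score follows. The mirrored-ballot construction, the helper names, and the sample ballots are all assumptions made for illustration.

```python
MAX_SCORE = 5

def opposite(ballot):
    """Mirror a ballot: each score s becomes MAX_SCORE - s."""
    return {c: MAX_SCORE - s for c, s in ballot.items()}

def margins(ballots, candidates):
    """Pairwise differences in total score, which are what decide a Score winner."""
    totals = {c: sum(b[c] for b in ballots) for c in candidates}
    return {(x, y): totals[x] - totals[y]
            for x in candidates for y in candidates if x < y}

candidates = ["A", "B", "C"]
# Hypothetical existing ballots and a hypothetical new ballot.
electorate = [{"A": 5, "B": 2, "C": 0}, {"A": 0, "B": 4, "C": 5}]
new_ballot = {"A": 1, "B": 5, "C": 3}

before = margins(electorate, candidates)
after = margins(electorate + [new_ballot, opposite(new_ballot)], candidates)
print(before == after)  # True
```

The pair adds the same constant to every candidate's total, so no margin changes. The same pair should also cancel under STAR: the finalists are unchanged, and the two ballots prefer opposite finalists in the runoff (or are both indifferent).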

The first condition is directly related to the final result, as it is defined in terms thereof. Being able to move the needle is defined in terms of effect on the final result under certain conditions, which can happen.

The second condition is indirectly related to the final result. The argument goes that if a system balances the power of voters whose honest stances or even strategic stances match votes that the system allows them to cast, but if there are other voters whose stances do not have corresponding votes that the system allows them to cast, they are being cheated because they are being partially muzzled. Clearly a system that allows them votes corresponding to their stances, and takes those votes fully into account in the tally, is giving them more power than a system that gives them a Sophie's Choice of votes that do not so precisely correspond to their stances as to the possible stances of other voters.

But anyway I'm still left confused about whether equality implies a unique result. Intuitively, it should.

rob

Here is an example where STAR produces a different result than Score on a Nader scenario, assuming that:

Gore and Bush are the two front runners,
it is very close between Gore and Bush, and
most voters that like Nader best prefer Gore over Bush.

These are pretty reasonable assumptions based on the 2000 election. (right?)

Nader voters who attempted to best express their preferences might vote Nader: 5, Gore: 3, Bush: 0. Under Score, lots of people voting this way, rather than giving Gore a 5, could cause Gore to lose. But giving Gore a 5 disallows that voter from expressing their preference for Nader over Gore.

In STAR, they could express that preference without handing the election to Bush (their least favorite), since Gore and Bush end up being the two front runners, and 3 vs 0 counts as much as 5 vs 0 in the second round. In fact, no matter which two are the front runners, they have expressed their vote in the most effective (i.e. strategic) way.
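To make the scenario concrete, here is a Python sketch with an invented 100-voter electorate. The bloc sizes and scores are purely hypothetical and chosen so the Gore/Bush race is close; they are not 2000 election data. `tally` and `electorate` are illustrative names.

```python
def tally(blocs):
    """Return (Score winner, STAR winner) for a list of (voter count, ballot) blocs."""
    totals = {}
    for n, ballot in blocs:
        for cand, s in ballot.items():
            totals[cand] = totals.get(cand, 0) + n * s
    ranked = sorted(totals, key=totals.get, reverse=True)
    x, y = ranked[:2]  # the two STAR finalists
    x_votes = sum(n for n, b in blocs if b[x] > b[y])
    y_votes = sum(n for n, b in blocs if b[y] > b[x])
    # Assumption: a tied runoff goes to the higher-scoring finalist.
    star = x if x_votes >= y_votes else y
    return ranked[0], star

def electorate(gore_score_from_nader_voters):
    """Hypothetical electorate; only the Nader bloc's score for Gore varies."""
    return [
        (49, {"Bush": 5, "Gore": 0, "Nader": 0}),
        (45, {"Bush": 0, "Gore": 5, "Nader": 1}),
        (6, {"Bush": 0, "Gore": gore_score_from_nader_voters, "Nader": 5}),
    ]

print(tally(electorate(3)))  # ('Bush', 'Gore')
print(tally(electorate(5)))  # ('Gore', 'Gore')
```

With Nader voters scoring Gore 3, Score elects Bush (245 to 243) while STAR elects Gore in the runoff (51 to 49). If they score Gore 5 instead, both methods elect Gore, at the cost of the ballot no longer showing a preference for Nader over Gore.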

The big problem with Score in this sort of scenario is that it can help entrench the 2 party system, since a 3rd party candidate like Nader would be discouraged from running (unless he runs under one of the major parties), since he can hurt those that like him, by causing their least liked candidate to win. That is, he has still split the vote, albeit not as strongly as under FPTP.

So yes, you could get different results under Score vs. STAR in that scenario, especially if you assume that not all voters are 100% sure who the front runners will be. (i.e. the more likely people are to wrongly guess that Nader might be a front runner, the more likely it would be for them to rate Gore lower than Nader)

In a 3 person race, I think STAR does really well. I have my doubts when it gets to be more than 3, which is why I'd prefer a method that selected the Condorcet candidate if one exists, and only hold that second round if there is no Condorcet winner.

Jack Waugh

@rob said in STAR vs. Score:

Nader voters who attempted to best express their preferences might vote Nader: 5, Gore: 3, Bush: 0.

But I'm pretty sure that's not the optimal strategy for Score. I think that many of them should vote Gore 5, and a few should vote Gore 4. They don't have to coordinate, to achieve that kind of a mix. If each individual dithers mentally between 5 and 4, the result can be random, so with everyone's random behavior, relative frequency follows probability.

Jack Waugh

As @Sass has pointed out to me, in some States of the US (I don't know about the provinces in other countries), there is a legal reason to promote STAR rather than Score. Those States have a (constitutional?) requirement that elections be decided by majority. STAR (like IRV) manufactures a fake "majority", which may pass muster in the courts, where Score would not.

rob

@Jack-Waugh said in STAR vs. Score:

But I'm pretty sure that's not the optimal strategy for Score. I think that many of them should vote Gore 5, and a few should vote Gore 4. They don't have to coordinate, to achieve that kind of a mix. If each individual dithers mentally between 5 and 4, the result can be random, so with everyone's random behavior, relative frequency follows probability.

So, even if they strongly prefer Nader to Gore, they should strategically say otherwise on their ballots, because they suspect only Bush and Gore will be front runners?

I don't see that as a positive. Score supposedly encourages voters to express their true preferences. If doing so is not good strategy, that is a failure, in my opinion. This is exactly why STAR does what it does.

What you are actually suggesting is that people vote in Score as if it is FPTP... attempt to guess who will be the front runners, and give your full vote power to your favorite of the two of them. Yuk.

Voting the way you suggest is strategic relies on voters knowing who is likely to be a front runner. This is unlikely to be true in many local elections, and even unlikely to be true in presidential elections once we have a system in place that doesn't favor a two party system so strongly.

@Jack-Waugh said in STAR vs. Score:

STAR (like IRV) manufactures a fake "majority", which may pass muster in the courts, where Score would not.

Calling this a fake majority seems to miss the point of the final step of STAR. STAR isn't just to pass legal muster, it is because systems that expect voters to be strategic have all kinds of issues such as forcing a two party system.

Jack Waugh

@rob said in STAR vs. Score:

I don't see that as a positive.

The question here was of whether the systems would give different results when voters use the strategy that applies to the system they are faced with. Assuming the voters vote the same way in both systems does not necessarily provide the answer.

rob

@Jack-Waugh We can't know what strategy Score voters will use. Meanwhile STAR reduces (but doesn't entirely eliminate) the incentive to be strategic.

I assume you think that a strategic vote under Score will be based on the voter knowing who the front runners are. But unless you know 100% how others are going to vote, that's a bit tricky to know, isn't it? Especially if you are assuming that those other voters are using the same strategy, which means they need to know how you are going to vote. And those are obviously dependent on one another.

So the best you can get is a Nash equilibrium. It's possible that there will be multiple equilibria, which I would expect in the case of a Condorcet cycle.

Which basically means your question is unanswerable. Because Score (as you seem to acknowledge) demands strategy, it is intrinsically unpredictable. You don't just need to know what voters' preferences are, you need to know what their strategy is, and how much they know about others' preferences... and then add a bit of "hall of mirrors" style infinite recursion into the mix for good measure. (you can see my simulator of this recursive equilibrium seeking behavior at https://pianop.ly/voteSim/voteSim.html or a video -- of an earlier version that didn't yet have Score voting -- at https://www.youtube.com/watch?v=NiS2A0QLeJU )

Many of us think that you should be able to vote without concern for who the front runners are. We don't want to worry that inaccurate polling could easily throw the election. We don't want a tight three way race to turn into a game of chicken. We don't want voting methods to work significantly differently in elections for which there is (or isn't) a lot of media attention.

You say in another thread that anything that deviates from "one person one vote" is anti-democratic, and I would argue that you are violating that principle if you are giving extra voting power to those who are better able to guess how others will vote.

In any case, I'll just say this. If you are trying to sell Score to the public, while acknowledging that voters are expected to vote dishonestly (or insincerely, or strategically, or whatever you want to call it).... good luck. I can pretty much guarantee you that a system that allows such fine-grained expressiveness, but then strongly incentivizes voters to use that expressiveness to say something that misrepresents how they really feel, is not going to fly. There are a lot of people (including myself) for whom that just feels dirty.