GPT and I invented a new voting system metric?
-
I meant to reply to this ages ago, but basically we're looking at Condorcetness of a voting method, and how close to the Condorcet winner the winner of a method generally is.
I think it's arguably of less practical use than utility, because if Condorcetness is what you want, then you just adopt a Condorcet method. Whereas if utility is what you want, there isn't such an obvious path to it.
Having said that, strategic voting could cause the Condorcet winner not to be elected in some situations even with a Condorcet method, so you might want to find the method that best maximises this thing you're measuring under certain strategic assumptions. In fact, Warren Smith would argue that score and approval voting might even be better at electing the Condorcet winner than Condorcet methods. See e.g. this.
Also in terms of Condorcetness, there is the "Game Theory" method, which is arguably the ultimate in Condorcet as I said in this post on the old CES Google group. That method also got discussed in this thread.
-
@psephomancy Just to add a bit more to this then - it's quite difficult to come up with a measure of the Condorcetness of a winner. Different Condorcet methods have different ways of determining a winner, so when there isn't a Condorcet winner, they can pick different winners. But each, by its own measure, would claim to be picking the "most Condorcet" winner.
For example, you talk about the number of pairwise defeats. That's basically Copeland's Method, so would consider the Copeland winner to be the most Condorcet, but it fails independence of clones, and is generally not considered to be great from a theoretical point of view.
Similarly there's Minimax, which elects the candidate that has the smallest pairwise defeat (if there isn't a Condorcet winner). So your measure would be based on the size of the winner's worst single pairwise result. But this also fails independence of clones, among other criteria.
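To make the difference concrete, here's a minimal sketch of both measures in Python. It assumes a pairwise margin matrix where margins[i][j] > 0 means candidate i beats candidate j; the example matrix is just illustrative.

```python
def copeland_scores(margins):
    """Copeland: count of pairwise victories for each candidate."""
    n = len(margins)
    return [sum(1 for j in range(n) if margins[i][j] > 0) for i in range(n)]

def minimax_scores(margins):
    """Minimax: each candidate's worst pairwise defeat margin
    (0 if undefeated, i.e. a Condorcet winner). Lower is better."""
    n = len(margins)
    return [max([-margins[i][j] for j in range(n) if j != i] + [0])
            for i in range(n)]

# An illustrative 3-cycle: A beats B by 22, B beats C by 6, C beats A by 2.
margins = [[0, 22, -2],
           [-22, 0, 6],
           [2, -6, 0]]
print(copeland_scores(margins))  # [1, 1, 1] - Copeland can't separate the cycle
print(minimax_scores(margins))   # [2, 22, 6] - A has the smallest worst defeat
```

As you can see, Copeland throws away all the margin information, which is why I'd call it low resolution.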
Then you have methods like Ranked Pairs and Schulze, which are known for their criterion compliance. However, under these methods candidates don't end up with a score that can be compared against the best winner's.
But we can perhaps use the Game Theory method, which I previously described as the ultimate in Condorcet. When there isn't a Condorcet winner, it is non-deterministic and picks between certain candidates with certain probabilities. But no strategy can beat it in the long term by the measure they are using (which I think is average pairwise win/loss).
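If I've understood the Game Theory method right, the probabilities come from solving the symmetric zero-sum game whose payoffs are the pairwise margins. Here's a rough sketch of how the lottery could be computed as a linear program; the function name is mine, and I'm assuming raw margins as the payoffs.

```python
import numpy as np
from scipy.optimize import linprog

def gt_lottery(margins):
    """One optimal mixed strategy for the symmetric zero-sum game on the
    pairwise margin matrix M. The game's value is 0, so we just need any
    probability vector p with a non-negative expected margin against every
    single candidate, i.e. M.T @ p >= 0. With no pairwise ties this lottery
    is unique, and a Condorcet winner gets probability 1."""
    M = np.asarray(margins, dtype=float)
    n = len(M)
    res = linprog(c=np.zeros(n),              # pure feasibility problem
                  A_ub=-M.T, b_ub=np.zeros(n),
                  A_eq=np.ones((1, n)), b_eq=[1.0],
                  bounds=[(0, 1)] * n)
    return res.x

# A 3-cycle: A beats B by 22, B beats C by 6, C beats A by 2.
margins = [[0, 22, -2],
           [-22, 0, 6],
           [2, -6, 0]]
print(gt_lottery(margins))  # ~ [0.20, 0.07, 0.73]
```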
So a method picks a winner, and you compare that winner against the Condorcet winner to get the pairwise result (which is just a draw if they are the same candidate). If there isn't a Condorcet winner, you compare the method winner against the Game Theory strategy overall. So if under the Game Theory method candidate A wins 50% of the time, B 40% and C 10%, you just take the weighted average of the pairwise results against these candidates, using those probabilities as weights.
So you now have this pairwise result: the winner's margin of defeat against the Condorcet winner, or just 0 if the winner is the Condorcet winner. To turn it into the Pairwise Ranking Efficiency measure in the same way as the utility version, take the difference between the average candidate's margin of defeat against the Condorcet winner (or lottery profile) and the method winner's margin, and divide by that average margin. This gives 1 when the method elects the Condorcet winner, and goes negative when its winner does worse than the average candidate.
For example, candidate A wins by some method. But A is not the Condorcet winner and is beaten by a margin of 10 votes by the Condorcet winner. The average margin of defeat against the Condorcet winner for a candidate is 30 votes. So the PRE is (30-10)/30 = 2/3.
An alternative would be to use median margin of defeat rather than mean.
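As a sanity check on the arithmetic, here's a tiny sketch (the function name is mine):

```python
def pre(winner_defeat, avg_defeat):
    """Pairwise Ranking Efficiency: 1 if the method elects the Condorcet
    winner (defeat margin 0), 0 if its winner only matches the average
    candidate, negative if it does worse than average."""
    return (avg_defeat - winner_defeat) / avg_defeat

# The worked example above: the method's winner loses to the Condorcet
# winner by 10 votes, and the average candidate loses by 30.
print(pre(10, 30))  # 0.666... = 2/3
print(pre(0, 30))   # 1.0 - the Condorcet winner itself
```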
-
@toby-pereira Criteria and strategy aren't relevant though; it's just a measurement of the "goodness" of the candidate.
-
@psephomancy said in GPT and I invented a new voting system metric?:
@toby-pereira Criteria and strategy aren't relevant though; it's just a measurement of the "goodness" of the candidate.
I was thinking this measure would be used to measure the quality of a method in general (the average score of its winners), as is done with utility measures, not just individual winners on a one-off basis. It was all there as context, with my potential answer in the second post.
Edit - But to be crystal clear about the criteria - I discussed the criterion compliance of Copeland and Minimax to consider whether they were the best measure of Condorcetness. We have to pick a measure of Condorcetness, and by doing that we are inevitably picking a Condorcet method. And so I was saying that I wouldn't want to pick a measure that isn't cloneproof. On the other hand, the Game Theory method is arguably the ultimate Condorcet method, so it might make sense to use the measure from that method.
-
@toby-pereira https://jamesgreenarmytage.com/dodgson.pdf seems like an interesting improvement
-
@multi_system_fan It's quite interesting, although allowing more than one round of voting changes things quite a lot, and there are probably also other alternatives that might be as good or better.
I don't think you could apply it to the metric being considered in this thread though.
-
@toby-pereira Sorry I wrote my previous short comment in line at the grocery store and forgot about this thread.
Yes, we need a measure of "Condorcetness" or "pairwise bestness". But, like the raw sum-of-utility measure, it doesn't need to be resistant to strategy or motivated by similar concerns that would apply to an actual voting system. It is only motivated by the philosophical "goodness" (representativeness) of the candidate, but I don't know Condorcet systems well enough to know what that would be.
-
@psephomancy Yes, I agree about the resistance to strategy. But I see a failure of independence of clones as separate from strategy. I think if a measure isn't cloneproof it's probably not a good measure. Also, Copeland is very low resolution anyway in that it just looks at number of defeats rather than the size of any of them.
-
@toby-pereira said in GPT and I invented a new voting system metric?:
I think if a measure isn't cloneproof it's probably not a good measure.
Why would that matter for a measure?
Also, Copeland is very low resolution anyway in that it just looks at number of defeats rather than the size of any of them.
That makes sense.
-
@psephomancy said in GPT and I invented a new voting system metric?:
@toby-pereira said in GPT and I invented a new voting system metric?:
I think if a measure isn't cloneproof it's probably not a good measure.
Why would that matter for a measure?
Because you can have a candidate that is the closest to being the Condorcet winner but not the Copeland winner. E.g.
14: A>B>C
4: B>C>A
12: C>A>B

A>B - 26:4
B>C - 18:12
C>A - 16:14

A has the biggest winning margin and smallest defeat, and is the nearest to a Condorcet winner by any reasonable measure. But then you can clone C into C1 and C2, where C1 is always ranked above C2.
In this case A now has two defeats (against C1 and C2) so loses to both B and C1 in Copeland. But A is still the nearest to a Condorcet winner in terms of defeat sizes, so I would say they are still the "most Condorcet" winner.
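Here's a quick sketch that checks this; the ballot encoding and helper name are just for illustration:

```python
def defeat_stats(profile, candidates):
    """Return {candidate: (number of pairwise defeats, worst defeat margin)}.
    profile is a list of (count, ranking) pairs, most-preferred first."""
    def margin(x, y):  # net votes preferring x over y
        return sum(c if r.index(x) < r.index(y) else -c for c, r in profile)
    stats = {}
    for cand in candidates:
        losses = [margin(other, cand) for other in candidates if other != cand]
        losses = [m for m in losses if m > 0]
        stats[cand] = (len(losses), max(losses, default=0))
    return stats

original = [(14, ["A", "B", "C"]),
            (4,  ["B", "C", "A"]),
            (12, ["C", "A", "B"])]
cloned   = [(14, ["A", "B", "C1", "C2"]),
            (4,  ["B", "C1", "C2", "A"]),
            (12, ["C1", "C2", "A", "B"])]

print(defeat_stats(original, ["A", "B", "C"]))
# A: (1, 2), B: (1, 22), C: (1, 6) - everyone has one defeat; A's is smallest
print(defeat_stats(cloned, ["A", "B", "C1", "C2"]))
# A: (2, 2), B: (1, 22), C1: (1, 6), C2: (2, 30)
# A now has two defeats, so Copeland drops it below B and C1,
# even though its worst defeat is still by far the smallest.
```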