Rule X extended to score ballots

Hi,

Recently I have been interested in the method "Rule X," introduced in this paper as well as section 3.5 of this paper. There are rigorous definitions there, so in this post I'll try just to give intuition and informal descriptions.

I will describe how it works on approval ballots, then how I have attempted to extend it to score.

The basic idea is: each voter starts with $1 (or just voting power 1, but money is a more fun metaphor). A candidate costs a Hare quota $q to elect. We can interpret an approval from a voter as willingness to spend their remaining budget to elect that candidate. Rule X sequentially chooses the candidate who can be purchased for $q at the lowest uniform price (not exactly uniform since some voters may exhaust their entire budget).

It's quite similar to, for example, Sequentially Spent Score, so let's compare the differences (using approval ballots!). Say the quota is $20. There are 35 supporters of candidate A who indicated a willingness and ability to pay $28 at $0.8 each. There are 30 supporters of candidate B who indicated a willingness and ability to pay $30 at $1 each.

SSS would choose B on this round, since the indicated demand is higher. Rule X would choose A on this round, since the price per supporter is lower. For each method they remove $q from the budgets of the supporters of the winner, but they also do this in different ways.

SSS treats a voter's remaining budget more like a multiplicative deweighting (yes I know this is handwavey), and subtracts a fraction of the $q in proportion to that voter's indicated willingness and ability to pay. On the other hand, Rule X treats a voter's remaining budget more like a subtractive deweighting, and chooses to subtract uniformly the smallest price possible such that if all the supporters pay that (up to their remaining budget) it will equal $q.

The reason I am interested in this rule is its excellent axiomatic characteristics. It satisfies Extended Justified Representation (EJR) and is a logarithmic approximation to the core.

I have extended Rule X to scored ballots with the following principle: when a voter gives a candidate a score of s < 1, then all payments made by that voter to elect that candidate should be made with efficiency s, so if a voter has scored a candidate 0.3 then for price set at 0 < p <= 1 they will spend 0.3p. I'll illustrate this by way of example, but I can draw up a formal definition if the example does not make it clear enough what I'm doing.

There is a coalition of 22 voters for candidate A with $0.6 remaining who scored A as 0.7
There is a coalition of 31 voters for candidate A with $0.35 remaining who scored A as 1.

Say the quota is again $20. Then the price p is the solution to 22*0.7*min(p, 0.6) + 31*1*min(p, 0.35) = 20 so the price will be set at p = $0.458. The coalition of 31 voters will be exhausted and the coalition of 22 voters will spend 0.7*p = 0.32 each to be left with $0.28 remaining.

If there is no other candidate that can be purchased for a price of lower than $0.458 then A will be selected and the ballots will be spent as above.

You may ask "what if no candidate can get a full quota of demand?"
This is a good question. What the authors of the method suggest is just to give every voter a little money uniformly until some candidate can be purchased (this is seq-Phragmen). Alternatively, it could just iteratively selected the highest-demand candidate and completely exhaust the budgets of its supporters. Note that this is exactly what SSS and Allocated Score do, so they run into the same problem of 'unaffordable' candidates, just it is a little harder to see.

There is already significant discussion in those papers about Rule X in general. Of course, if there are glaring issues with the approval variant that would be important to hear, but what I am more interested in is an evaluation of my attempt to extend it to score ballots.

Does the scored variant have any holes in terms of quality or strategic behavior?
Is this the most natural way to extend Rule X to scored ballots or is there another that makes more sense?

Looking forward to hearing thoughts, and again if my informal definition-by-example is not sufficient I am happy to provide pseudocode or a real formula.

@brozai I'll answer my own question here:

The authors extend Rule X to cardinal utilities in this paper
https://proceedings.neurips.cc/paper/2021/hash/69f8ea31de0c00502b2ae571fbab1f95-Abstract.html

Looks like they do it the same way.

Keith Edmonds

@brozai
First let me say how happy I am to hear that an academic did not plagiarize SSS which is what I had originally thought. I have talked with Dominik and he knows of SSS.

I would love to see an example where SSS with Capping, SSS with scaling and Rule X all get different winner sets. I do not really care if SSS is deemed worse in the end as long as I get what credit I deserve.

In terms of semantics we use "ballot weight" for the budget. It is better for public communication for a number of reasons. $ makes sense for their budgeting problem but for fair elections it needs to be clear that voters get the same amount of "vote power"

The issue with no full quota looks very similar to what is attempted in Sequentially Shrinking Quotas. I think this was invented by @Marylander if memory serves. Perhaps that works well when mixed with Rule X.

@Jameson-Quinn Has made arguments that you do not want exact PR/EJR/Pricability. The logic being that strategic voting gives a bias and you want a system with an equal and opposite bias. He can maybe give a better justification but this was a key reason why Allocated Score was chosen over SSS. Or at least that was the argument which won me over. All this is in the absence of strategy. STAR was invented because Score is vulnerable to strategy.

Marylander

@keith-edmonds Ought the electowiki page for Justified Representation be updated with the Full Justified Representation concept that they came up with here?

Keith Edmonds

@marylander Yes, yes it should. Also adding the cardinal versions since the existing page only has the approval definitions.

Toby Pereira

@keith-edmonds said in Rule X extended to score ballots:

@Jameson-Quinn Has made arguments that you do not want exact PR/EJR/Pricability. The logic being that strategic voting gives a bias and you want a system with an equal and opposite bias.

I've always felt that the Justified Representation class of criteria are clumsy and not really the best criteria to be using. See this archived CES discussion: https://www.votingtheory.org/archive/posts?where={"topic_id"%3A564}

By the way, I think that the "purest" form of cardinal PR is COWPEA, although it has the potential drawback of electing candidates with different weights. But it can be used as a lottery method to elect sequentially as whole candidates. While using lotteries might be impractical in many cases, it does have the advantage of passing monotonicity, Independence of Irrelevant Ballots and the Universally Liked Candidate Criterion. Getting those three things together in a single method is very rare.

Marylander

@toby-pereira Do you think that EJR in particular is clumsy, or all of them in general?

Toby Pereira

@marylander said in Rule X extended to score ballots:

@toby-pereira Do you think that EJR in particular is clumsy, or all of them in general?

Yes, because as I said in the archived discussion - "all these criteria seem oddly weak in that they only make demands about a single voter, or in proportional justified representation just one voter per candidate that the group "deserves"."

In Appendix C there is some fun discussion as to the best completion method to make it exhaustive (if there is the problem of an unaffordable candidate).

They conclude that seq-Phragmen is no longer quality (although, I suspect that they extended it in the literal sense of the greedy algorithm, rather than extending it in a way that fits the metaphor of giving every voter epsilon power until a candidate is affordable)

The rule they recommend instead is quite similar to Sequentially Shrinking Quota, but instead of shrinking the quota after the fact, they just pick the quota at the beginning such that k winners will be affordable. I'm calling it Shrunken Quota just preliminarily.

They do not seem to consider largest remainders, which is curious because it is probably the most 'obvious' completion method, so maybe they found something undesirable about that approach.

Keith Edmonds

FYI there is a wikipedia page (well a draft)

https://en.wikipedia.org/wiki/Draft:Method_of_Equal_Shares

@toby-pereira Is your issue that JR-like criteria are clumsy or that they are weak?

I find the definitions pretty appealing, although there may be other important parts of the story to consider (e.g. stability, Pareto).

There are many ways to strengthen ideas like PJR, although they usually are not satisfiable over the entire ballot domain, so you can also look at compliance of a method when it is possible.

I see JR / PJR / EJR as more of a litmus test; passing it does not tell you a lot about the method, but NOT passing it requires a very good explanation for the method to be considered further.

Toby Pereira

@brozai I suppose I find them both clumsy and weak. Take Justified Representation. It refers to the case where a Hare quota of voters all approve one candidate. Intuitively, you might think that as a solid Hare quota, they should all get this candidate elected. But further reflection suggests this isn't always possible.

For example, there might be two candidates to be elected in an election. Three candidates are approved by 50%, 51% and 51% of the electorate respectively with no correlation between voting for one or any of the others. In this case the two on 51% would be elected, so not the candidate on 50%.

Of the voters who approved the 50% candidate (it has a full Hare quota), slightly under a quarter of them would not have a candidate elected.

It would be impossible to guarantee that every voter in a Hare quota group gets a candidate elected. So how do we make a criterion around it? Well, they went for the minimum possible option - one of the voters must have a candidate elected. At that stage it just seems like a pointless nod towards something they were probably trying to achieve but realised they had to back out of.

@toby-pereira I think this example actually shows my point. In this case, (with very high probability if the approval sets are truly uncorrelated) any two of the winners will satisfy EJR (and thus also PJR/JR), so it is not restrictive at all. I'm sure you have seen this paper thrown around before, but it gives some nice examples of how two different committees, both satisfying EJR, are not necessarily all of the same quality.

Unless I'm misunderstanding the setup, I'm not sure why you are saying the two 51% will necessarily be elected (although, in this case it does seem like the 'right' choice).

I agree that it is a somewhat weak criterion in the grand scheme of things, and I would like to simultaneously optimize a metric like maxPhragmen (or related). I don't see them as clumsy at all though. In fact, every quota-spending rule (that sequentially selects a candidate with >= quota of support when they exist) satisfies PJR and thus JR.

You can find the versions of JR I think you're referring to as "strong" and "semi-strong" JR, where a solid Hare quota gets exactly that consensus candidate elected. Unfortunately, as you said, it's not always possible.

Edit: I should probably mention there is another intuitive criterion, perfect representation. This is when the voters can be exactly divided into quotas such that each quota gets a unanimous winner. Obviously, this is also not always possible, but more importantly it is incompatible with EJR. This is one reason maybe it's reasonable to consider EJR 'clumsy.' However, it is compatible with PJR. It seems to me that PJR is weak enough such that any noncompliance is likely indicative of deeper problems. In particular, optimization of the maxPhragmen metric implies PJR. Your 'squared load' metric I believe is equivalent to the varPhragmen objective function, which implies JR.

Ted Stern

@keith-edmonds @brozai , is there a way to get the Wikipedia page author to also put a page on electowiki.org? Seems like it might be faster since no approval would be required.

Marylander

@andy-dienes said in Rule X extended to score ballots:

I see JR / PJR / EJR as more of a litmus test; passing it does not tell you a lot about the method, but NOT passing it requires a very good explanation for the method to be considered further.

Although didn't the paper say that anything passing EJR was going to be difficult to compute efficiently? That suggests that for some applications, there is a good explanation to consider sacrificing EJR: you can't spare either the time or the complexity.

On the other hand, that's still no excuse to use a method that doesn't pass PJR, even a sequential one.

@marylander I suppose it depends on how you define 'efficient.' Rule X satisfies EJR, and I would not say it is much more complicated than any other quota-spending method! Computationally it is the same amount of work.

no excuse to use a method that doesn't pass PJR

Completely agree. This is why welfarist rules like RRV are basically a non-starter in my eyes.

Keith Edmonds

@andy-dienes said in Rule X extended to score ballots:

This is why welfarist rules like RRV are basically a non-starter in my eyes.

Agreed. So it comes down to SSS, MES and Allocated score.

Consider this 5 winner example with clones for each candidate
Red: 61% vote A:5, B:3, C:0
Blue: 39% vote A:0, B:3, C:5

RRV Gives ['A1', 'C1', 'A2', 'B1', 'B2']
MES Gives ['A1', 'A2', 'A3', 'C1', 'B1']
SSS Gives ['A1', 'B1', 'B2', 'B3', 'B4']
Allocated score Gives ['A1', 'B1', 'A2', 'B2', 'A3']
STV Gives ['A1', 'A2', 'A3', 'C1', 'C2']

I could have made a calculational error but I did it with code which I can post if people want to look for bugs. If correct this is super interesting. They all give different results.

Which sets are in the core? If any?

Keith Edmonds

OK lets look at these results more deeply

RRV Gives ['A1', 'C1', 'A2', 'B1', 'B2']
-- Total Utilities for each group of 16 and 11

MES Gives ['A1', 'A2', 'A3', 'C1', 'B1']
-- Total Utilities for each group of 18 and 8

SSS Gives ['A1', 'B1', 'B2', 'B3', 'B4']
-- Total Utilities for each group of 17 and 12

Allocated score Gives ['A1', 'B1', 'A2', 'B2', 'A3']
-- Total Utilities for each group of 21 and 6

STV Gives ['A1', 'A2', 'A3', 'C1', 'C2']
-- Total Utilities for each group of 15 and 10

The fist thing to check for is if the winner sets are stable. The larger group is 61% so it should control 3/5 of the seats. The best set of size 3 for them is [A,A,A] giving utility of 15. Each set above gives the larger group at least 15 so [A,A,A] does not block. Similarly for the smaller group controlling 1 seat [C] does not block. For sets of size 5 it must be preferred by every voter. The set given by SSS blocks both STV's and RRV's set meaning that those sets are not stable. I think the other 3 are.

I am not sure how to choose between the three remaining sets but it is interesting to consider that the members of 61% group would be expected to have about 61% of the total utility. For STV this is nearly exact since 15/(15 + 10) = 60% this is why people tend to think it does well.

RRV: 16 / (16 + 11) = 0.59
MES: 18 / (18 + 8 ) = 0.69
SSS: 17 / (17 + 12) = 0.58
Allocated score: 21 / (21 + 6) = 0.77
STV: 15/(15 + 10) = 0.60

Interestingly RRV and STV both do very well even though they were eliminated by being unstable. SSS does well too but MES and Allocated Score do not.

Another way to think about it is total utility
RRV: 16 + 11 = 27
MES: 18 + 8 = 26
SSS: 17 + 12 = 29
Allocated score: 21 + 6 = 27
STV: 15 + 10 = 25

Again SSS is best. As the inventor of SSS I am trying to not be biased. Is there something I am missing? Is this just a special case? Perhaps we should simulate it like we did before to see what typical results are

@keith-edmonds Purely qualitatively, I think given this profile, Red needs to win more than Blue since it is equally cohesive and 50% larger. Again qualitatively, I think a cohesive group representing 40% of the population should get at least one of their top winners.

This leaves AAABC (from MES) and AAACC (from STV) as what I would identify as the 'best' performance on this particular ballot profile. However, this one is a bit of an edge case and the numbers work out very closely. I always feel uneasy about making too large conclusions from single examples, since any method can be made to look bad.

The results will also look extra weird because the ratios of supports do not play well with 5 winners for this specific scenario. With 4 winners I think AABC will look very reasonable, and with 6 winners AAABCC or AABBBC is right.

In particular, if the Red voters are strategic really at all, it looks like they can squeeze out more winners in RRV and SSS by burying B. I am interested in using the definition of "balanced stable priceability" here https://www.cs.toronto.edu/~nisarg/papers/priceability.pdf to measure stability---I am not sure how or if it relates to the definition based on blocking sets.

Marylander

@keith-edmonds said in Rule X extended to score ballots:

Interestingly RRV and STV both do very well even though they were eliminated by being unstable. SSS does well too but MES and Allocated Score do not.

On the other hand, RRV and STV choose winner sets where all voters are strictly worse off than under the SSS winner set, so if we make the assumption* that the sum of the scores can be used to determine which overall committee the voter would approve, then could be interpreted as quite a bad example for RRV and STV.

* I'm not calling it an unreasonable assumption, but it is an assumption and so I'm stating it. Perhaps we could test it with surveys, although in my opinion the meaning of scores depends on the voting system to some extent, so it might not be easy.

Edit:

@andy-dienes said in Rule X extended to score ballots:

@keith-edmonds Purely qualitatively, I think given this profile, Red needs to win more than Blue since it is equally cohesive and 50% larger. Again qualitatively, I think a cohesive group representing 40% of the population should get at least one of their top winners.

I started writing this post before you replied @Andy-Dienes (so I posted before I could comment on what you said), and I just want to observe that this objection I think relates to my discussion about about whether we can assume that "greater sum score = better winner set" at the level of the individual.