A ranked method where ranking only a subset isn't bad strategy

cfrank

@rob that is quite an interesting consideration, and it gives voters some incentive to rank all of their preferences without making it necessary, since by doing so it is more likely that their lower preferences will become reinforced.

So do you mean, for example, that if you had four candidates a,b,c,d and isolated just three ballot types such as

c<b<a
d<c<b<a
e<c<b<a

that the first ballot types would also contribute half and half to each of the lower ballot types? Or that if you had something like

5 : c<b<a
5 : d<c<b<a
10 : e<c<b<a

That the first ballot types would be distributed one third to the middle ranking and two thirds to the bottom?

Or if you had

5 : c<b<a
5 : d<c<b<a
10 : e<c<b<a
5 : e<d<c<b<a

Perhaps the distribution would trickle down from the ballots with fewer entries.

rob

Note sure I understand your examples (sorry, pretty busy so I didn't spend a long time). But the idea is that ballots that list more candidates would contribute to ballots that list fewer candidates.

But here is an example. Say we have these ballots

10: A>B>D>C
5: A>B>C>D
20: A>D>B>C
15: A>C>B
10: C>B>A
10: C>B>D
10: C>A>D
1: A>B
2: A
1: C

The A>B ballot would be expanded to A>B>D>C, since that is the most common (non-expanded) ballot for those whose first two choices are A>B.

The A ballots would probably be expanded to A>D>B>C, since that is the most common ballot for those that simply have A as a first choice.

The C ballot would be expanded to C>B>A, since B is the second choice for most ballots that have C as a first choice, and A is ranked higher, on average, than D on more ballots that have C as a first choice.

You could imagine in a Bush-Nader-Gore race with ranked ballots, all the people who simply bullet voted for Gore would (probably) get Nader added as their second choice, and those who bullet voted for Nader would get Gore added as their second choice. It is hard to know what a bullet vote for Bush would get expanded to.

I'm not trying to get too far into specifics of an algorithm/formula, just give a basic idea. I would think an algorithm would be pretty straightforward though. (but not necessarily super efficient) I'd think it would work best with a fairly large number of voters.

Again, this is based on the general idea of collaborative filtering, which tries to predict peoples preferences on things they haven't rated, based on the ratings of people who otherwise rate things similarly. https://en.wikipedia.org/wiki/Collaborative_filtering

cfrank

@rob Yes I see what you are indicating. I was imagining the ballot types being distributed according to their relative frequency rather than simply choosing the mode. I can try to illustrate what I was imagining more effectively:

I'll take your exact distribution of ballots:

10: A>B>D>C
5: A>B>C>D
20: A>D>B>C
15: A>C>B
10: C>B>A
10: C>B>D
10: C>A>D
1: A>B
2: A
1: C

Here is the algorithm I think laid out explicitly, it is recursive so I will do a single pass through without repetition. Just to use some coding language let the orderings be indicated as finite strings, for example A>B>C>D would simply be ABCD. So the ballots you indicated would be equivalent to

10: ABDC
5: ABCD
20: ADBC
15: ACB
10: CBA
10: CBD
10: CAD
1: AB
2: A
1: C

(1) If there are any strings that do not include every candidate, consider all strings of the minimum available length. Otherwise, terminate the re-distribution.
---In this first pass, there are certainly ballots that do not include all candidates, so we note that there are three strings of the minimum available length, namely the 2 : A and the 1 : C.

(2) Take the set of strings of minimal length that appear as ballots. For each one of these strings, consider every other string that contains that string as a prefix (which might be called a "super-string").
---In this pass, the other strings that contain A as a prefix are

10 : ABDC,
5 : ABCD,
20 : ADBC,
15 : ACB,
1 : AB.

Therefore there are 51 total super-strings of 'A.' Take the total number of ballots indicating the prefix string ('A' in this case, of which there are 2), and distribute them among the super-strings according to their relative frequency. For example, 20/51 of the super-strings of A are ADBC, and there are 2 strings only indicating the prefix 'A.' Therefore, we would add 2*(20/51) additional tallies to the count of 'ADBC' strings. 10/51 super-strings of A are ABDC, so we would accumulate 2*(10/51) tallies toward 'ABDC,' etc.

(3) Repeat from step (1) until termination.

Computationally this is equivalent to multiplying the tallies of each super string by (1+P/S) where P is the number of prefix strings and S is the number of superstrings of the prefix.

rob

@cfrank Right... I mean, I probably wouldn't look at them as strings so much, but that's mostly an implementation detail.

I'd do something like this. This assumes 5 candidates (A, B, C, D and E) but I'm not bothering looking at my example.

First I'd build an object like this, which is basically a hash table of hash tables:

(note: I leave off the final ranking, so "ABCDE" is equal to "ABCD" with the E implicit)

{
 "AB": {
  "DC" : 22, // (22 ballots that start with AB end with DC)
  "CD" : 14,
  "ED" : 3,
  "EC" : 1,
  "D" : 35,
  "C" : 21,
  "E" : 3 // (3 ballots that start with AB only follow
           //  it with E and leave C and D tied for last)
 },
 "ABC" : {
   "E" : 1,
   "D" : 4
 },
 "A" : {
   "BDC" : 44
   // etc
 }
// etc
}

So (for example) for each ballot that is ABCD, you'd add a "BCD" to the "A" member (which means adding one to its value if it already is there), you'd also add a "CD" to the "AB" member, and add a "D" to the "ABC" member.

After that pass, I'd condense it to something like this:

{
  "AB": "DC",
  "ABC" : "D",
  "A" : "BDC",
  // etc
}

To do that you need to just do a weighted average on the rankings for each. (note that to do this you probably want to expand them into arrays)

Now you have a quick lookup to apply to each ballot. If you get an "AB" ballot, you can instantly append a "DC" to it.

cfrank

@rob but the ballots essentially are strings, I thought describing certain ballots as strict prefixes of other ballots was an apt way to illustrate the concept I had in mind. With your suggested scheme as it stands, each “incomplete” ballot is filled in with the mode of its fully-ranked superstrings. In other words, the dominant fully-ranked superstring gets the whole of the tally for each of its prefixes. What I’m getting at is I don’t see why not distribute the prefix ballots among all of their superstrings (including the less filled out ones) according to the relative frequency of each of the more extensive ballots.

rob

@cfrank said in A ranked method where ranking only a subset isn't bad strategy:

In other words, the dominant fully-ranked superstring gets the whole of the tally for each of its prefixes.

Not sure if you understand what I proposed. I will admit, I am striking a balance between simplicity of explaining/tabulating, and what might be truly optimal.

Let me try again, with a simple example, and no code (because the difference between arrays and strings is ultimately irrelevant).

Say this is all the ballots that list A as first choice:

2: A>B>C
3: A>C>B
5: A

In the simplest implementation, you'd just expand all of those last ones (the 5 that bullet voted for A) to A>C>B, because, of the ballots that start with A, there are more that rank C above B than those that rank B above C.

So you end up with:

2: A>B>C
8: A>C>B

A more sophisticated algorithm would, instead of just appending C>B to all 5 A ballots, divide them proportionally.

Now, you end up with

4: A>B>C
6: A>C>B

Of course, this works out nicely because there are exactly 5 of them, so 3 of them can be expanded to A>C>B and 2 can be expanded to A>B>C. In the real world, you still are going to have to do some rounding, but if you have thousands of ballots, the effect of that becomes less pronounced.

Still, the "best" ways I can think of for doing this are obviously inappropriate for a voting system for political elections, simply because you wouldn't want to write legislation to describe them. I've written collaborative filtering algorithms, for instance I did pretty well in the Netflix Prize contest some 14 years ago: https://en.wikipedia.org/wiki/Netflix_Prize

I can actually describe how I did that pretty easily, and I did in this article: https://www.karmatics.com/docs/evolution-and-wisdom-of-crowds.html (which got a lot of readership when it was published)

But I wouldn't want to write legislation for it.

cfrank

@rob what you just described now is exactly the same as what I interpreted originally. I also don’t think rounding would ever be necessary unless you had a bad computer, because you will only ever be dealing with rational numbers.

rob

@cfrank What I mean by rounding is this:

Say you have this

2: A>B>C
3: A>C>B
4: A

Using the proportional method, you should expand 2/5ths of the A bullet votes to A>B>C, and 3/5ths of them to A>C>B.

That leads to this:

3.6: A>B>C
5.4: A>C>B

Which.... well, it adds up to 10 votes. Is that ok? I don't know. Seems weird to have fractional votes.

cfrank

@rob I think it's perfectly fine, I don't see what the issue is. I personally don't find it weird at all, especially understanding the sensible mechanism that produced those fractions.

rob

@cfrank Ok, I guess I can't disagree with that, especially since I consider this all hypothetical anyway.