Handling non-deterministic tie-breaking in voting criteria

BTernaryTau

I'm trying to figure out what the best way to handle ties would be for the sequential cancellation criterion. For the cancellation criterion, I was able to get around this issue by treating voting methods as functions without worrying about the specifics of what those functions output. Thus, voting methods that break ties randomly could simply output something like a weighted collection of candidates, and the cancellation criterion would be satisfied if that output stayed the same, even if the non-deterministic outcome did not. However, the sequential cancellation criterion has to set a specific output format since it cares about whether an individual candidate is elected at the same time across multiple elections, so that approach isn't an option.

My initial thought is to have the voting methods output a weighted collection of lists, then require that the individual lists satisfy a modified version of the sequential cancellation requirements. But this adds a lot of complexity to an already complex criterion. Another option would be to just restrict the criterion to deterministic methods and have those methods handle ties by violating neutrality (e.g. breaking ties by alphabetical order). Are there any better ways to handle this issue?

Marylander

@BTernaryTau If a method is sequential, then for a deterministic method, I would treat the output as an ordered list of candidates. For a nondeterministic method, I would treat the output as a discrete probability space, where the outcomes are the different possible ordered lists with length equal to the number of winners, and the probability for each list is the probability that it represents the list of winners in the order that they are selected.

Then given a set of cast ballots B and a method m, we can define random variables W_B,1, W_B,2, ..., W_B,w (where w is the number of winners) representing the candidate who wins each round.

I would define the criterion as: for every set of cast ballots B, for every ballot b, for every nonnegative integer k<w, for every ordered set of k candidates W such that P({W_B,1, ..., W_B,k} = W) > 0, there exists some ballot b' such that both of the following hold:

For all positive integers i <= k, b(W[i]) = b'(W[i]) (referring to entries in W, not W_i)
If B' = {b_1, ..., b_n, b, b'}, and P({W_B',1, ..., W_B',k} = W) > 0, then for all candidates c, P(W_B,k+1 = c | {W_B,1, ..., W_B,k} = W) = P(W_B', k+1 = c | {W_B',1, ..., W_B',k} = W).

Essentially, a ballot b' can only cancel a ballot b at a round assuming a specific list of previous winners. (Since the method is nondeterministic, there might be multiple possible lists.) Each ballot b should have (possibly, but not necessarily different) cancelling ballots for all possible sets of winners. For b' to cancel b for a particular list of previous winners, the two ballots must give the same score to all candidates on that list, and adding them to the set of ballots cast either not affect the probability of electing any candidate next conditional on the list of previous winners being as assumed, or make it impossible for that list of previous winners to still be selected.

BTernaryTau

@Marylander Thank you for this! This is exactly the criterion I was picturing for handling non-deterministic methods. I have decided I'm going to keep the deterministic version of the criterion as the default, but this is definitely worth mentioning as an extension.

BTernaryTau

Coming back to this, I'm wondering if a better approach would be to model non-deterministic voting methods as functions that take a random key as input in addition to the ballots, similar to the way that pseudorandom functions are handled. Then if the deterministic sequential cancellation criterion is passed for every possible key, we can say that the non-deterministic method passes sequential cancellation.

If I'm correct, this version of the criterion is equivalent to Marylander's version but isolates the randomness to a single variable (the key) in a way that allows it to be ignored for the most part. I believe this makes the criterion easier to reason about, and this feels like the approach that I was searching for when I started this thread. Is there any reason to avoid it?

BTernaryTau

Alright, I've determined that this approach would not behave exactly like Marylander's version. Specifically, it's possible to have two voting methods with the same probability of electing each list of candidates in every election and yet have only one of them pass the random key version. As a concrete example, here are two implementations of breaking approval voting ties uniformly at random:

break an n-way tie by picking the ith candidate, where i = key mod n
break an n-way tie by picking the ith candidate, where i = key + (total number of approvals) mod n

Here is a simple example election:

1: approves A and B

And here is a pair of cancelling ballots:

1: approves A and C
1: approves B

For the first tie-breaking implementation, i stays the same when the cancelling pair is added, so the tie is broken the same way and no violation of sequential cancellation occurs. For the second tie-breaking implementation, i is flipped by the addition of the cancelling ballots because the total number of approvals increases by 3. Thus, the tie is broken in favor of the opposite candidate and sequential cancellation is violated. This behavior generalizes such that the first implementation passes key-based sequential cancellation while the second does not.

This is unfortunate because one of the intended advantages of the sequential cancellation criterion is that if two voting methods always produce the same results, either both will pass it or both will fail it. This is already weakened a little by requiring the order in which candidates are elected to remain the same, and under the random key implementation it would need to be weakened more by requiring the candidates elected to remain the same for every key.

As of now, I still prefer the key-based version to the fully probabilistic version, but this seems like a major downside. I'm not sure if there is any way to combine the advantages of both approaches, but if there is a way it would be very helpful.

Marylander

A deterministic approach that just treats ties as their own kind of outcome might be able to gain the advantages of the probabilistic approach while being able to dispense with the randomness. I didn't try to do that because runoff methods can end up with sequences of tiebreakers so that all candidates that might be involved in a tiebreaker might not have equal chance of winning overall, and also to make the criterion applicable to methods that aren't even mostly deterministic, such as random ballot.

BTernaryTau

Ok, I think I figured out how to get around the key-based version's issue. The problem is that the key is essentially exposing the voting method's internal randomness mechanism to the criterion rather than treating it as a black box. So to get around that issue, we can require that, for some non-deterministic method m, there exist a function f(k, e) that passes the key-based version of the criterion such that for all elections e and candidates c, P(f(k, e) = c) = P(m(e) = c). That way we can isolate the randomness of f to one variable just as before while still treating the voting method itself as a black box.

@Marylander do you see anything I'm missing here?

Marylander

@bternarytau

This is a little nitpicky, but the notation

P(f(k, e) = c)

implies that k is random (because if k is fixed, then this probability would have to be either 0 or 1), but this is not stated.

Assuming that there are finite possible seeds and that each seed is equally likely to be chosen, we could write:
P(f(k, e) = c) = sum over s in S (I_c(f(s, e)))/|S|,
where S is the set of seeds, I_c(f(s, e)) = 1 if f(s, e) = c and I_c(f(s, e)) = 0 otherwise.

For many methods, |S| might need to be chosen in a way that depends on some limited information about the election, in particular the number of candidates (since for there to be an even 3-way tiebreaker, |S| needs to be divisible by 3 and so on). Methods that are "routinely" non-deterministic (such as random ballot) might also require the number of votes to select |S|.

Marylander

This post is deleted!