Mathematical Paradigm of Electoral Consent

cfrank

@brozai thanks, I’ve certainly tried to do a lot of thinking, but it’s been mostly done alone so I’m trying to check myself and get some criticism and outside perspective.

My intention isn’t necessarily to satisfy particular pre-existing criteria, but to provide an alternative unifying paradigm for analysis. For example, in my own experiments the weighted PFPP algorithm aligns with STAR very often (but not always). However, STAR is not designed to optimize statistical measures regarding SP frontiers. I think what constitutes a failure (in the social sense, not the axiomatic sense of simply not satisfying certain formal properties) depends on what paradigm you subscribe to. Although it may seem contrary to my “first-principles” talk, I believe that each voting system should be taken on its own merits and evaluated based on results rather than criteria.

For example, weighted PFPP does not satisfy the majority criterion, but that’s actually sort of the point. I don’t believe the majority criterion is a good thing as compared with building a broader/more inclusive consensus of the electorate.

Any SP efficient OSS should satisfy all of the typically enjoyed properties of any cardinal score system. (Except perhaps “Frohnmeyer balance,” but as I have demonstrated elsewhere and intend to organize in this paper, that criterion is impotent).

Weighted PFPP with updating or relevant/informative probability distributions generally mitigates bullet voting and emphasizes the power of broad consensus over majoritarian strategy. Consistently utilized strategies will become noise as the system updates its distributions or as the distribution takes account of the frequency of certain types of candidate score profiles.

I wonder if you have any specific axioms in mind. The main goal is to emphasize broad consent over majoritarianism without eliminating the expressiveness of ballots.

brozai

@cfrank Well, for example, this method seems extremely vulnerable to burial. Also, I would be surprised if it satisfies any form of Participation, even the weaker ones.

There are definitely some interesting ideas here, and I like how you are trying to look at the entire distribution of scores rather than just the average. However, the fact that it does not coincide with majority rule on 2 candidates makes it already a non-starter for me.

cfrank

@brozai I am curious about why participation would surprise you. Giving a candidate a higher ordinal score will only increase their chances of winning the election. For example, if I score a candidate as S4, then my indication raises the SP ceilings of the candidate at S1, S2, S3, and S4 because of the way SP consent is defined (S0 doesn’t really matter, it can’t be raised or lowered).

Also just know that I am trying to make the strongest case for this concept as I can without being dogmatic, so if I come off that way just let me know. My image for this system incorporates past data into distributions that are more or less stable. In fact, the distributions could be allowed to update only after each election, rendering each individual election deterministic but the whole sequence of electoral processes less predictable.

Burial does seem to be an issue, although some form of tactical voting is bound to make an appearance. Not to dismiss it—burial is serious. It is also a risky strategy though, and a STAR modification could help curb the incentive. Also I think effective burial requires information about front-runners, and if there are many different platforms or if the system is multi-winner it becomes less plausible for a rational voter. Still, voters are free to be irrational and/or risky. If you have any other considerations about that or about my rationale there I think it would be constructive for me to hear.

In terms of the majority criterion, I think we may just operate on different paradigms, and just to clarify, the example I gave in the pamphlet was between two candidates who may be in the context of a larger election. There are examples in real life where a majoritarian victory even in an election between only two candidates would be totally anti-social, like the pizza topping problem (3 people plan to pitch in equally for a single-topping pizza, but 2 people prefer a topping the 3rd is allergic to, for example).
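The pizza topping problem can be made concrete with a small sketch. The topping names, the 0-5 scale, and the scores are invented for illustration, and the consensus rule used here is plain score voting (not weighted PFPP), chosen only because it is the simplest rule that looks beyond first preferences.

```python
from collections import Counter

# Hypothetical 0-5 scores; every number here is made up for illustration.
ballots = {
    "voter_1": {"mushroom": 5, "cheese": 4},
    "voter_2": {"mushroom": 5, "cheese": 4},
    "voter_3": {"mushroom": 0, "cheese": 5},  # allergic to mushroom
}

def majority_winner(ballots):
    """Each voter backs their top-scored option; the most-backed option wins."""
    first_choices = Counter(
        max(scores, key=scores.get) for scores in ballots.values()
    )
    return first_choices.most_common(1)[0][0]

def score_winner(ballots):
    """Plain score voting: the option with the highest total score wins."""
    totals = Counter()
    for scores in ballots.values():
        totals.update(scores)  # adds each option's score to its running total
    return max(totals, key=totals.get)

print(majority_winner(ballots))  # the two-voter majority's favorite
print(score_winner(ballots))     # the option everyone can accept
```

With these numbers the majority rule picks the topping the third person cannot eat, while the score total favors the option all three rate highly.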

There is a book called “Patterns of Democracy” by Arend Lijphart where the distinction between majoritarian democracy and consensual democracy is made clear, and it’s also made clear that consensual democracies are more highly correlated with superior social outcomes than majoritarian ones, and I think that makes sense.

The more I think about it, the more I feel like something like proportional representation makes sense. I just also think that it would be ideal to have the choices of representatives be as “consensual” as possible, but it isn’t easy to determine what exactly that is supposed to mean.

Marylander

@cfrank
So far I have read the first 5 pages. Here is what I think so far.

Given that the document has the appearance of a scientific paper, it is a bit weird that there are no citations.

Some of the history in the beginning of the paper seems broad and tangential to me. I could be wrong about this, though, so I ask other people who read this to check and see if they agree.

I think that some of the formalization on page 5 is incorrect. I don't think you can take the candidate set to be infinite (at least, not without some further conditions) because it might be impossible to find a winner. For example, if the number of candidates is countably infinite and every voter prefers C_j to C_i whenever i < j, the Pareto criterion would forbid any candidate from being elected.

Extending the number of voters to infinite cases I think might also require some conditions as I suspect issues related to convergence and measurability might come up if it is done haphazardly.

cfrank

@marylander that’s sensible about the presentation. The pamphlet is not complete and I intend to provide citations where appropriate, but I’m not sure what citations would be needed.

The broad overview is intended for people who are not necessarily familiar with voting theory, and the purpose is to establish the context of the document. I agree that some of it is tangential and I intend to make changes.

Also, the formalism is not incorrect; it would just require an appropriate decision algorithm to select a winner from a continuum, or it may not allow certain criteria to be satisfied in certain cases, as you indicated. But for example, if the candidate set is a collection of points in a plane, and the voters assign each candidate a score from a continuum according to distances from certain ideal points, then the decision algorithm might select a candidate that minimizes some chosen objective function of the scores.

But that’s not super relevant anyway, since only finite sets are considered.
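The planar example (candidates as points, scores derived from distances to voters' ideal points) can be sketched as follows. The coordinates are invented, and the objective function, total Euclidean distance to all ideal points, is only one assumed choice among many; the post does not specify one.

```python
import math

# Hypothetical positions in the plane; all coordinates are illustrative.
candidates = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (4.0, 4.0)}
ideal_points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]  # one per voter

def total_distance(point, ideal_points):
    """Assumed objective: sum of Euclidean distances to every ideal point."""
    return sum(math.dist(point, ideal) for ideal in ideal_points)

def select_candidate(candidates, ideal_points, objective=total_distance):
    """Return the candidate whose position minimizes the objective."""
    return min(
        candidates,
        key=lambda name: objective(candidates[name], ideal_points),
    )

print(select_candidate(candidates, ideal_points))
```

Passing a different `objective` (for instance, the maximum distance instead of the sum) changes which notion of "best" the decision algorithm encodes without changing the rest of the sketch.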

brozai

@cfrank You may be interested in this notion of "generalized Condorcet winners" via "Borda dominance." A paper on the topic is here: https://www.jstor.org/stable/43662517 (let me know if you don't have access and I will get you a PDF).

Your proposal, and in particular "SP dominance" reminds me a bit of this idea.

Marylander

@cfrank

@cfrank said in Mathematical Paradigm of Electoral Consent:

The pamphlet is not complete and I intend to provide citations where appropriate, but I’m not sure what citations would be needed.

Your historical discussion on pages 2-4, for one thing. The definition of STAR voting also probably deserves a citation. Your discussion of the relevance of these ideas to politics also might merit some citations.

Pages 6-7:

The following stipulation is adopted: That if one intends to utilize probability measures to establish a decision algorithm for an OSS in a democracy, any utilized probability measure imposed on the electorate should be uniform.

In what way would you impose a probability measure on the electorate? What would you do with it?

In your definition of SP-consent ceiling, did you mean R >= S instead of R > S?
I assume that R needs to be an element of S. Using R > S can lead to some consequences that I am not sure if you intended. For example, in an Approval election in which a candidate gets 65% approval, (0, 0.65) is part of the SP-consent ceiling, as is (0, r) for r in [0.65, 1].
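A minimal sketch of the difference between the two inequalities. The pamphlet's definition is not reproduced in this thread, so this assumes the consent level at a score S is simply the fraction of voters whose score R for the candidate satisfies the inequality; the function name and the 100-voter electorate are hypothetical.

```python
from fractions import Fraction

def consent_fraction(scores, threshold, strict=False):
    """Fraction of voters whose score R satisfies R >= threshold,
    or R > threshold when strict=True (assumed reading of the definition)."""
    if strict:
        supporters = sum(1 for r in scores if r > threshold)
    else:
        supporters = sum(1 for r in scores if r >= threshold)
    return Fraction(supporters, len(scores))

# Approval election: 65 of 100 voters approve (score 1), 35 do not (score 0).
approval_scores = [1] * 65 + [0] * 35

for s in (0, 1):
    print(s, consent_fraction(approval_scores, s),
          consent_fraction(approval_scores, s, strict=True))
```

Under R >= S the level at score 0 is the whole electorate and the level at score 1 is the approval share. Under R > S the approval share shows up at score 0 instead and the top score always gets zero, which is the anomaly raised above.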

Pages 7-8: There seem to be a lot of loose threads here. I think you need to find a point and stick to what relates to it (although not necessarily what supports it; discussing contrary perspectives is fine). Things like the role of decision algorithms in machine learning should probably go in their own discussion at the end, which could cover alternate applications of these ideas.

brozai

@marylander In a similar vein to "loose threads," I think the connection to compression algorithms is supported only by the fact that the set of winners is smaller than the set of candidates. There might be a stronger philosophical argument to relate proportional representation committees to compression algorithms, but for single winner schemes I cannot see it.

cfrank

@marylander that makes a lot of sense, I can definitely find good citations for all of those things. Thank you.

In terms of the probability measure, the electorate as I have defined it is a finite set of objects called voters, and any finite set can be equipped with a probability measure to turn it into a probability space. In terms of how it is used, that depends on the decision algorithm. It isn't easy to formalize the concept because decision procedures can get really wild, and I would have to restrict the scope to a specific kind of decision algorithm to say anything much more meaningful. I tried to connect it with Lewis Carroll's desiderata but it isn't formal. It might just be unnecessary.

For the SP consent ceilings, you are correct that my definition as written has an anomaly, and you are also absolutely correct about the intended meaning.

Thank you for your input, I have these concepts floating around in my head so trying to put them down on paper and running them by other people who are knowledgeable and have a fresh perspective is very helpful.

cfrank

@brozai I think it depends on your perspective. Just as a rough example, one could create a formal model in which voters and candidates have "investment" distributed over a set of "interests," so that each voter is essentially a probability distribution over those interests. Then one could take the sum total of those interest distributions and create an "electoral interest" distribution.

If each candidate is also a probability distribution over those interests, choosing a candidate can be seen as more or less projecting/compressing the electoral distribution into the set of distributions determined by the candidate pool. With this conception a voting system functions exactly as a compression algorithm.

Real life is more complicated than that, but I hope that illustrates my thinking better: a candidate's platform can be seen as a (high quality or lousy) compression of electoral interests.
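A rough sketch of that "compression" picture, under stated assumptions: the interest names and all weights are invented, the electoral distribution is taken to be the normalized sum (the average) of the voter distributions, and "projecting" is read as choosing the candidate closest in total variation distance. None of these choices is fixed by the description above.

```python
interests = ["housing", "transit", "parks"]  # hypothetical interest set

# Each voter and candidate is a probability distribution over the interests.
voters = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.4, 0.4, 0.2],
]
candidates = {
    "X": [0.8, 0.1, 0.1],
    "Y": [0.4, 0.4, 0.2],
    "Z": [0.1, 0.1, 0.8],
}

def electoral_distribution(voters):
    """Average the voter distributions so the result still sums to 1."""
    n = len(voters)
    return [sum(column) / n for column in zip(*voters)]

def total_variation(p, q):
    """Assumed notion of closeness: half the L1 distance between p and q."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def closest_candidate(voters, candidates):
    """'Compress' the electorate onto the nearest candidate distribution."""
    target = electoral_distribution(voters)
    return min(
        candidates,
        key=lambda name: total_variation(target, candidates[name]),
    )

print(closest_candidate(voters, candidates))
```

In this toy case one candidate reproduces the electoral distribution almost exactly, so the compression is nearly lossless; with a poorer candidate pool the minimum distance would measure how lossy the best available choice is.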

brozai

@cfrank You seem very intent on reformulating all the language, definitions, and algorithms used for voting in terms of probability measures and random variables. Out of curiosity, and I hope this doesn't come off as confrontational, why is that?

It is not less rigorous to just use the conventional definitions used in social choice theory where ballots are weak orders and so on. Similarly, I do not see the use in considering generalizations of the voter / candidate sets to be of arbitrary (infinite) cardinality.

It could possibly be of interest to study limiting behavior of voting rules as the number of candidates or voters grows (for example, studying the probability of a tie or of a Condorcet cycle in the limit of these quantities), but I don't think that's what's happening here.

rob

@brozai said in Mathematical Paradigm of Electoral Consent:

You seem very intent on reformulating all the language, definitions, and algorithms used for voting in terms of probability measures and random variables.

Hopefully this won't come off as a pile-on, but this was something I was confused about as well (regarding a previous paper). It seems way more complex than it needs to be. I compared it to saying that "a randomly selected point in a glass has a 75% chance of being occupied by liquid," as opposed to simply saying that "the glass is 75% full." It strikes me as a very roundabout way of expressing a simple concept.

cfrank

@brozai I am not trying to reformulate all of the language, definitions, and algorithms for voting in terms of probability measures and random variables. I am proposing a specific paradigm that happens to include an intimate incorporation of probability theory, along with a few specific voting systems that fit nicely within that paradigm. Connecting with a larger mathematical framework allows use of the powerful tools that belong to that framework, and probability theory seems appropriate to me.

I agree that it is not less rigorous; it's equally rigorous. It's just the way I think and express myself, probably because my background is in pure math. If you see a more apt way to describe the concepts I am proposing, I would definitely like to hear it. I want to find a higher level of abstraction that can maybe unify some of the things we're looking at in voting theory. If I could find a good theoretical foothold I would be using category theory, but I don't want to go too far off into abstract nonsense that nobody wants to look into.

I don't actually agree that it is very much more complex than it needs to be. As I mentioned in the introduction of the pamphlet, connecting voting theory with probability theory is nothing new. Condorcet was one of the pioneers of voting theory and his Jury Theorem is a direct application of probability theory to voting theory. As another example, Nash's equilibrium theorem is a direct application of probability theory and topology to game theory.
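For readers who have not seen the Jury Theorem, here is a small sketch of the quantity it concerns, in its standard textbook setting: an odd number of voters, each independently correct with the same probability. The function name is hypothetical.

```python
from math import comb

def majority_correct_probability(n_voters, p_correct):
    """Probability that a strict majority of n independent voters is correct,
    when each voter is correct with probability p_correct (n must be odd)."""
    if n_voters % 2 == 0:
        raise ValueError("use an odd number of voters to avoid ties")
    needed = n_voters // 2 + 1  # smallest number of correct votes forming a majority
    return sum(
        comb(n_voters, k) * p_correct**k * (1 - p_correct) ** (n_voters - k)
        for k in range(needed, n_voters + 1)
    )

for n in (1, 3, 11, 101):
    print(n, round(majority_correct_probability(n, 0.6), 4))
```

When each voter is better than a coin flip, the majority's accuracy rises with the size of the group; when each voter is worse than a coin flip, it falls.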

The use of generalization is just that it is more general, and might be more amenable to application in other areas. I mentioned machine learning as one such area. And I want to point out that ordinal scores are different from weak orderings.

brozai

@cfrank I have a pure math background as well, so trust me, I'm no stranger to painful detail and abstraction.

I agree that there can be interesting connections to probability theory. The Jury Theorem is a great example.

In this case, I am not convinced it is necessary or instructive to introduce any additional tools or definitions beyond what is already commonplace in voting algorithms.

In the words of Dijkstra:

"The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise."

And I believe we already have the tools to be absolutely precise with regular old ranked ballots over finitely many candidates. Sticking with conventional nomenclature and concepts will help people understand your proposal much more easily, and will also help contextualize it and compare to other methods.

In particular, I believe SP dominance is in fact equivalent to Borda dominance, and I believe your weighted PFPP scheme is in fact equivalent (edit: not equivalent since we have to allow skipped rankings, but very closely related) to a positional scoring rule, but these connections are very hard to see underneath all the new definitions and unnecessary framework.

It's possible I am misinterpreting something and that the equivalence I suggest above is invalid, but if this is the case I would find it very helpful to my understanding if you could provide an example where they differ!

cfrank

@brozai I want to look into that Borda dominance scheme and see if it is different from my proposal. (EDIT: I totally do have access)

PFPP may be equivalent to a positional scoring rule at each election, but the prescription of the particular scoring rule and how it is allowed to change from one election to the next according to informative distributions is what makes PFPP different. For example, a thought I had earlier today was actually that if the distributions are allowed to update, then as fewer people over-use the higher score values, they become more potent when they are used. This can give voters an even stronger incentive against strategic bullet voting, since it will weaken their vote in the future when they may actually feel strongly about a candidate.

I'll read the paper you linked and see if they coincide. If they do, I'll be happy, because that probably means more analysis has been done on this system! Otherwise I'll try to illustrate points where I find that they differ. I think already the fact that the winner is called "generalized Condorcet" points to something different, since the methods I am proposing (at least on the surface, I could be wrong) have nothing to do with the Condorcet criterion.

brozai

@cfrank

PFPP may be equivalent to a positional scoring rule at each election, but the prescription of the particular scoring rule and how it is allowed to change from one election to the next according to informative distributions is what makes PFPP different

Ok, fair enough, but I can't comment on whether or not this is a good thing. It certainly would be a radical reform to current elections.

I sent the paper to the Gmail address attached to the Google Drive you shared, so let me know if you don't receive it. Unfortunately they don't do a ton of analysis besides introducing the concept of positional scoring dominance and then proving what types of dominance are actually constructible (what they call "Condorcet words"), but I think it is relevant nonetheless. In the language of this paper I think SP dominance corresponds to the k-Condorcet winner in the case where k = n.

cfrank

@brozai on second thought, PFPP and its weighted variants definitely are not positional score systems even without accounting for the potentially changing distributions. It is only a positional score system if the distributions used for the random SP ceiling heights are uniform. I think the explanation of the system will become clearer with visuals.

In any case it could very well be that being a k-Condorcet winner when k=n is equivalent to being a unique candidate that is not SP dominated. I’m not sure! Still working through the paper.

I tried to give an explanation of the unweighted PFPP system a while back through a video. It may help, if you were interested, but I understand if it’s not your cup of tea! This is the video:

https://app.vmaker.com/record/SGSydGYcwOW9Vf6d

It’s like 20 minutes… 10 if you do x2, potentially less if you skip around.

On a related (maybe controversial?) note I take some issue with the Condorcet criterion. I also have noticed that Electowiki doesn’t seem to be very objective about it. While a Condorcet winner has the majority support of the electorate over any other candidate in a pairwise face-off, the majority groups that support the winner from different face-offs can differ from each other dramatically.

In other words, I would say that there is no guaranteed stable locus of electoral consent for a Condorcet winner—it is rather like a stitching together of victories in various unrelated and somewhat gamified competitions, and to me this makes the Condorcet paradox less of a paradox. In line with the concluding remarks of the paper I think it’s not at all obvious or necessarily correct that the Condorcet winner is the ideal choice even when one exists.
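A small constructed example of this point, using five hypothetical voters and three hypothetical candidates (L, M, R). The sketch finds the Condorcet winner and then lists which voters make up the majority in each of its pairwise wins.

```python
# Hypothetical ranked ballots, most preferred first.
rankings = {
    "v1": ["L", "M", "R"],
    "v2": ["L", "M", "R"],
    "v3": ["M", "L", "R"],
    "v4": ["R", "M", "L"],
    "v5": ["R", "M", "L"],
}

def supporters(rankings, a, b):
    """Set of voters who rank candidate a above candidate b."""
    return {
        voter for voter, order in rankings.items()
        if order.index(a) < order.index(b)
    }

def condorcet_winner(rankings):
    """Return the candidate who wins every pairwise face-off, or None."""
    candidates = next(iter(rankings.values()))
    for a in candidates:
        if all(
            2 * len(supporters(rankings, a, b)) > len(rankings)
            for b in candidates if b != a
        ):
            return a
    return None

winner = condorcet_winner(rankings)
coalitions = {
    rival: supporters(rankings, winner, rival)
    for rival in ("L", "M", "R") if rival != winner
}
print(winner, coalitions)
```

Here the winner is the first choice of only one voter, and the two majorities behind its pairwise victories share just that single voter.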

cfrank

@brozai I also wanted to address your point about majority rule. If you are referring to May's Theorem, I think it's important to consider the scope of the proof. The theorem is proved assuming that voters can indicate one of only three options, -1, 0, or +1. The formal properties don't fully make sense beyond that scope.

brozai

@cfrank Well, this is true, but I think the monotonicity condition means that, with strategic agents, it has a natural extension to score ballots.

cfrank

@brozai I would need to see it formally. I can see what you mean about strategic agents, but I don’t see any extensions mentioned on Wikipedia other than a similar statement (whatever that means) for approval voting (fully strategic score voting), and some for other first-preference aggregators, which would have us using plurality voting. I personally doubt a useful extension to generic score voting exists, because of the nature of the preferential ambiguity between a strong majoritarian assertion and a weaker but broader consensus.