No CPU, No Photocell

Sass

@jack-waugh Here's the process I wrote for NUTIC:

Preference matrices aren’t difficult either. There are three different ways to make them, so poll workers can use whichever they are most comfortable with. The most efficient procedure is to have the poll worker simply add one for each pairwise win on a ballot in the corresponding box in a matrix. This takes a bit of training, but for those with a sharp mind, it will likely be the preferred procedure. A middle approach is to list each pair of candidates and have the poll worker go through each ballot and add one to the winner between each pair, if there is one. The longest way, which is probably best for recounts, is to pick one pair of candidates and sort each ballot into one of three piles: one pile for the first picked candidate’s pairwise wins, the second pile for the second picked candidate’s pairwise wins, and the third pile for ballots that show no preference between the two picked candidates; then, count how many ballots are in each pile.
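To make the first procedure concrete, here is a minimal sketch of "add one for each pairwise win in the corresponding box". The ballot format (an object mapping candidate name to rank, 1 being best) and the rule that unlisted candidates rank last are assumptions for illustration, not part of the NUTIC write-up.

```javascript
// Build a pairwise preference matrix from ranked ballots.
// Assumed ballot format (hypothetical): { candidateName: rank }, 1 = best.
// Equal ranks are allowed; an unlisted candidate is treated as ranked last.
function pairwiseMatrix(candidates, ballots) {
  const n = candidates.length;
  // matrix[i][j] = number of ballots preferring candidates[i] over candidates[j]
  const matrix = candidates.map(() => new Array(n).fill(0));
  for (const ballot of ballots) {
    for (let i = 0; i < n; i++) {
      for (let j = 0; j < n; j++) {
        if (i === j) continue;
        const rankI = ballot[candidates[i]] ?? Infinity;
        const rankJ = ballot[candidates[j]] ?? Infinity;
        if (rankI < rankJ) matrix[i][j] += 1; // one tally mark in box (i, j)
      }
    }
  }
  return matrix;
}

const exampleCandidates = ['Ava', 'Bianca', 'Cedric'];
const exampleBallots = [
  { Ava: 1, Bianca: 2, Cedric: 3 },
  { Ava: 1, Bianca: 2, Cedric: 2 }, // no preference between Bianca and Cedric
  { Bianca: 1, Ava: 2 },            // Cedric left unranked
];
const exampleMatrix = pairwiseMatrix(exampleCandidates, exampleBallots);
// exampleMatrix is [[0, 2, 3], [1, 0, 2], [0, 0, 0]]
```

A ballot with no preference between two candidates adds nothing to either of their boxes, which corresponds to the third pile in the sorting procedure.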

For STAR specifically, the question is whether we are making an entire preference matrix or just finding the winner based on the ballots we have. For the latter, it's two rounds of (hand) counting. First, go through and add up scores. Then, declare your 2 finalists. Then, regather the ballots and go through them one-by-one, putting each into one of 3 piles: one pile for ballots preferring the first finalist, one pile for ballots preferring the second finalist, and one pile for ballots showing No Preference between the finalists. The finalist with the biggest pile is elected.
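The two rounds can be sketched roughly as follows. The ballot format (candidate name mapped to a 0-5 score, blank counted as 0) and the tie handling (list order for finalists, higher scorer wins an even runoff) are assumptions made for this example only.

```javascript
// Two-round STAR count: add up scores, then sort ballots into three piles.
// Assumed ballot format (hypothetical): { candidateName: score }, blank = 0.
function starCount(candidates, ballots) {
  // Round 1: add up the scores for every candidate.
  const totals = Object.fromEntries(candidates.map((c) => [c, 0]));
  for (const ballot of ballots) {
    for (const c of candidates) totals[c] += ballot[c] ?? 0;
  }
  // The two highest totals are the finalists (ties fall back to list order).
  const [first, second] = [...candidates].sort((a, b) => totals[b] - totals[a]);

  // Round 2: one pile per finalist plus a No Preference pile.
  const piles = { [first]: 0, [second]: 0, noPreference: 0 };
  for (const ballot of ballots) {
    const s1 = ballot[first] ?? 0;
    const s2 = ballot[second] ?? 0;
    if (s1 > s2) piles[first] += 1;
    else if (s2 > s1) piles[second] += 1;
    else piles.noPreference += 1;
  }
  // Assumption: an even runoff goes to the finalist with the higher score total.
  const winner = piles[second] > piles[first] ? second : first;
  return { totals, finalists: [first, second], piles, winner };
}

const starResult = starCount(['Ava', 'Bianca', 'Cedric'], [
  { Ava: 5, Bianca: 3, Cedric: 0 },
  { Ava: 5, Bianca: 4, Cedric: 1 },
  { Ava: 0, Bianca: 5, Cedric: 2 },
  { Ava: 1, Bianca: 5, Cedric: 3 },
  { Ava: 2, Bianca: 4, Cedric: 5 },
  { Ava: 4, Bianca: 4, Cedric: 0 },
]);
// starResult.totals is { Ava: 17, Bianca: 25, Cedric: 11 }
// starResult.piles is { Bianca: 3, Ava: 2, noPreference: 1 }, so Bianca is elected
```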

Adding up scores (by hand) would actually probably be the longest portion, especially if the order of candidates on those ballots is random.

Full preference matrices for sure take longer, but the more counters you have, the more you can creatively break up the work in efficient ways. For example, you could have one counter for each candidate plus 1 or 2 counters managing the stack of ballots. The ballot counter looks at one ballot and says out loud "Ava beats everybody. Bianca beats everyone except Ava. No preference between Cedric and Deegan, but they both beat Eli. Eli loses to everyone." Then the candidate-assigned counters record their respective candidate's tallies for each matchup on their own sheet. This also provides redundancy for each matchup so no one can get away with fudging numbers for a particular matchup.
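One way to picture the redundancy: every matchup ends up on two sheets, so the sheets can be compared against each other. The sketch below assumes each candidate's counter writes down both wins and losses against every opponent; that layout, and the numbers, are hypothetical.

```javascript
// Cross-check per-candidate tally sheets.
// Assumed layout (hypothetical): sheets[candidate][opponent] = { wins, losses }.
function findMismatches(sheets) {
  const mismatches = [];
  const names = Object.keys(sheets);
  for (const a of names) {
    for (const b of names) {
      if (a >= b) continue; // visit each unordered pair once
      const ab = sheets[a][b];
      const ba = sheets[b][a];
      // A's wins over B must equal B's losses to A, and vice versa.
      if (ab.wins !== ba.losses || ab.losses !== ba.wins) {
        mismatches.push([a, b]);
      }
    }
  }
  return mismatches;
}

const exampleSheets = {
  Ava:    { Bianca: { wins: 60, losses: 40 }, Cedric: { wins: 70, losses: 25 } },
  Bianca: { Ava:    { wins: 40, losses: 61 }, Cedric: { wins: 55, losses: 30 } },
  Cedric: { Ava:    { wins: 25, losses: 70 }, Bianca: { wins: 30, losses: 55 } },
};
const sheetMismatches = findMismatches(exampleSheets);
// sheetMismatches is [['Ava', 'Bianca']]: those two sheets disagree, so recount that matchup
```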

I don't have a number of staff seconds, but I don't think this is the biggest burden. As you noted, people really care about election security. If a community really wants to hand count, I don't think this would be a deterrent.

Jack Waugh

@sass, are you not the one who argued that a deal breaker with IRV is that it requires the cast-vote records assembled in one place for the tally, and there are too many opportunities for mistakes or fraud in sending those from the precincts?

Jack Waugh

@sass said in No CPU, No Photocell:

have the poll worker simply add one for each pairwise win on a ballot in the corresponding box in a matrix.

Adding one on paper pretty much requires that instead of Arabic numerals, there are tally marks. So, for a matrix sheet to last very long, it needs maybe 25 sq cm for each cell. So each row or column takes 5 cm. We have 20 candidates across the top, so that's a meter. And you can't subtract in one square, so you need both the upper and lower triangles of the matrix. I made a 5cm x 5cm square and put tally marks across the top and down the side and I calculate that I can get 140 in the square. So that's about how many ballots the workers could process before filling up the sheet and moving on to a new sheet.

Jack Waugh

@sass said in No CPU, No Photocell:

pick one pair of candidates and sort each ballot into one of three piles: one pile for the first picked candidate’s pairwise wins, the second pile for the second picked candidate’s pairwise wins, and the third pile for ballots that show no preference between the two picked candidates; then, count how many ballots are in each pile.

I believe this is fastest and least error-prone, because machines can be used to count the ballots in each pile. A single witness, whose name I forget, writing on antisocial media, said that people make mistakes when counting a pile by eye and hand, and the machines don't.

But the time for this procedure grows as O(N^2) vs. O(N) for Score, where N is the count of candidates.
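To put numbers on that growth: the pile-sorting procedure needs one pass through the ballots per pair of candidates, and there are N(N-1)/2 pairs. A small sketch (the function name is just illustrative):

```javascript
// Number of candidate pairs, i.e. pile-sorting passes needed for a full matrix.
function pairwisePasses(candidateCount) {
  return (candidateCount * (candidateCount - 1)) / 2;
}

const passesFor5 = pairwisePasses(5);   // 10
const passesFor20 = pairwisePasses(20); // 190
```

So with the 20 candidates used in the tally-sheet estimate above, that is 190 sorting passes, against 20 running totals for Score.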

rob

@jack-waugh As long as the raw ballot data is released to the public, Score is no better than STAR or Condorcet in this respect.

If you release that ballot data to the public so anyone (news organizations, people like us, etc) can run the tabulations themselves, this problem is solved. While malicious software could somehow modify the data before that, Score is no more immune to this than any other method.

But you seem to be talking about fraud happening at a later stage, that is, when doing the actual tabulation: converting the ballots into a single winner.

If the ballot data is published, it would quickly be noticed if there was fraud happening at the stage you are speaking of. It is highly unlikely anyone is going to do what you suggest -- at great technical challenges, expense and risk -- if it is so easy to catch.

As to how this can be released, it could be truly "raw," like the 40 meg file below from the 2018 San Francisco mayor ranked choice election (where a few hundred thousand people voted): https://pianop.ly/ballotData/SFMayor2018/rawtextfile.txt

Better yet, it can be condensed into something much easier to work with, like this: https://pianop.ly/ballotData/SFMayor2018/data.js In an ideal world, both would be released.

(shown below with about 900 lines removed in the middle)

SFMayor2018Data = {
    candidates: {
     a: 'Michelle Bravo',
     b: 'Jeff Sheehy',
     c: 'London Breed',
     d: 'Lawrence Stark Dagesse',
     e: 'Mark Leno',
     f: 'Rafael Mandelman',
     g: 'Jane Kim',
     h: 'Richie Greenberg',
     i: 'Angela Alioto',
     j: 'Amy Farah Weiss',
     k: 'Ellen Lee Zhou',
     l: 'Antoine R. Rogers'
    },
   
    ballots: [
     { count: 13162, ranks: 'gej' },
     { count: 13039, ranks: 'c' },
     { count: 9996, ranks: 'ceg' },
     { count: 9547, ranks: 'cge' },
     { count: 9053, ranks: 'gec' },
     { count: 8793, ranks: 'egc' },
     { count: 6056, ranks: 'ge' },
     { count: 6024, ranks: 'ecg' },
     { count: 5508, ranks: 'fbd' },
     { count: 5356, ranks: 'eg' },
     { count: 5193, ranks: 'cei' }, 

/* removed about 900 lines here */

     { count: 2, ranks: 'dcf' },
     { count: 2, ranks: 'dkb' },
     { count: 2, ranks: 'deb' },
     { count: 2, ranks: 'dii' },
     { count: 2, ranks: 'lca' },
     { count: 2, ranks: 'gel' },
     { count: 2, ranks: 'hl' },
     { count: 2, ranks: 'efi' },
     { count: 2, ranks: 'idd' },
     { count: 2, ranks: 'clk' },
     { count: 2, ranks: 'fgl' }
    ]
   
   };

Even if they only release it as they did (in an unwieldy 40 meg file), that should be good enough to make sure that if someone tries to cheat the election the way you describe, it will be instantly revealed. Sure, it will typically take a CPU rather than something you can do by hand, but the important thing is that it can be independently verified, not that electronics are not involved.

So much easier than a hand recount anyway.

(my converter of the raw data to the condensed data is here: https://pianop.ly/ballotData/SFMayor2018/converter.js)

Jack Waugh

I should have put this topic under "Election integrity and security". I don't know why sometimes I don't seem to use all the mental capacity I think I was born with.

If the cast-vote records are published, it would create an opportunity for coercion. Someone could threaten someone if they don't vote a specific way. The demand would include not only the person's vote in the race the coercer cares about, but also a selection of downballot votes that the coercer assigns to just that one coerced person, but that aren't that likely to be anyone's voluntary votes. Then if no ballot appears in the published records meeting the exact requirements, the coercer knows the coerced didn't comply, and can enact the threat.

rob

@jack-waugh said in No CPU, No Photocell:

If the cast-vote records are published, it would create an opportunity for coercion

I am obviously not suggesting that the identities of the voters be revealed. Take a look at the San Francisco mayor data I showed. Either in its original form, or after I processed it... either is fine.

In this case it was an IRV election, and voters could only rank three out of a field of 12. This is a bit of what the processed data looks like:

 ballots: [
     { count: 13162, ranks: 'gej' },
     { count: 13039, ranks: 'c' },
     { count: 9996, ranks: 'ceg' },
     { count: 9547, ranks: 'cge' },
    ...
    ]

There is no personally identifying data in there. It is also lacking the precinct identifying data, but the original source had this. The important point is that there is enough information to re-do the tabulation step, which anyone at home can do.
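As a rough idea of what re-doing the tabulation at home could look like, here is a minimal instant-runoff sketch over ballots in the `{ count, ranks }` shape shown above. The sample ballots are made up for illustration (they are not the San Francisco data), and the tie handling is an arbitrary assumption, not the official rule.

```javascript
// Instant-runoff tabulation over condensed ballots of the form
// { count, ranks }, where ranks is a string of candidate letters, best first.
function irvWinner(ballots) {
  // Every candidate letter that appears on any ballot starts out in the race.
  const remaining = new Set(ballots.flatMap((b) => [...b.ranks]));
  while (remaining.size > 0) {
    const tallies = new Map([...remaining].map((c) => [c, 0]));
    let active = 0;
    for (const { count, ranks } of ballots) {
      // A ballot counts for its highest-ranked candidate still in the race.
      const top = [...ranks].find((c) => remaining.has(c));
      if (top === undefined) continue; // exhausted ballot
      tallies.set(top, tallies.get(top) + count);
      active += count;
    }
    const sorted = [...tallies].sort((a, b) => b[1] - a[1]);
    const [leader, leaderVotes] = sorted[0];
    if (leaderVotes * 2 > active || remaining.size === 1) return leader;
    // Eliminate last place; ties are broken by sort order (an assumption).
    remaining.delete(sorted[sorted.length - 1][0]);
  }
  return null; // no ballots at all
}

// Hypothetical sample, not real election data.
const sampleBallots = [
  { count: 40, ranks: 'abc' },
  { count: 35, ranks: 'cba' },
  { count: 25, ranks: 'bc' },
];
const sampleWinner = irvWinner(sampleBallots);
// Round 1: a 40, c 35, b 25 (no majority); b is eliminated; round 2: c 60, a 40.
// sampleWinner is 'c'
```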

And this information is typically already supplied today in any ranked choice election, and I think we can assume it would be supplied in any Score, STAR or Condorcet election as well.

I'm just saying that as long as this information is supplied, the sort of fraud you are concerned about is easy to check for by anyone who wants to bother. It would be caught as quickly as the faulty tabulation was caught when voting for the domain name for this forum last year.

Marylander

@jack-waugh In general I think if you make people do data entry for several hours a day they'll probably make the occasional mistake. The best ways I can think to avoid it would probably be to have more than one person looking over each count (since they're unlikely to make the same mistake) and to have some sort of tally procedure that makes it simple to check for errors. (With something like finding pairwise matrices for Condorcet, keeping a Borda count as well might be useful for catching errors, for example.)
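A sketch of how that Borda check could work, assuming the Borda variant where a candidate earns one point per candidate ranked strictly below them on each ballot. Under that assumption, each row of the pairwise matrix should sum to that candidate's Borda total, so a separately kept Borda tally flags rows containing a slip. The numbers are illustrative.

```javascript
// Compare a hand-tallied pairwise matrix against a separately kept Borda count.
// matrix[i][j] = ballots preferring candidate i over candidate j.
// bordaTotals[i] = candidate i's Borda total (one point per candidate beaten per ballot).
function rowsFailingBordaCheck(matrix, bordaTotals) {
  const failing = [];
  matrix.forEach((row, i) => {
    const rowSum = row.reduce((sum, wins) => sum + wins, 0);
    if (rowSum !== bordaTotals[i]) failing.push(i);
  });
  return failing;
}

const talliedMatrix = [
  [0, 2, 3],
  [1, 0, 2],
  [0, 0, 0],
];
const consistentRows = rowsFailingBordaCheck(talliedMatrix, [5, 3, 0]); // []
const suspectRows = rowsFailingBordaCheck(talliedMatrix, [5, 4, 0]);    // [1]
```

A failing row only says that one of the two tallies is off for that candidate; it narrows down where to recount rather than saying which number is wrong.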

@jack-waugh said in No CPU, No Photocell:

Adding one on paper pretty much requires that instead of Arabic numerals, there are tally marks. So, for a matrix sheet to last very long, it needs maybe 25 sq cm for each cell. So each row or column takes 5 cm. We have 20 candidates across the top, so that's a meter. And you can't subtract in one square, so you need both the upper and lower triangles of the matrix. I made a 5cm x 5cm square and put tally marks across the top and down the side and I calculate that I can get 140 in the square. So that's about how many ballots the workers could process before filling up the sheet and moving on to a new sheet.

I think this is fine. If someone makes a counting mistake, and we are using tools to safeguard against them, then having broken the ballots into smaller groups will make it easier to identify where the mistake was.

@jack-waugh said in No CPU, No Photocell:

I should have put this topic under "Election integrity and security". I don't know why sometimes I don't seem to use all the mental capacity I think I was born with.

If you want, I can move it.

Jack Waugh

@rob said in No CPU, No Photocell:

{ count: 13162, ranks: 'gej' },

The tabulators would have to sort the ballots by their response for each race. Can this be done in a reasonable time, for a system that allows equal ranking?

rob

@jack-waugh said in No CPU, No Photocell:

The tabulators would have to sort the ballots by their response for each race. Can this be done in a reasonable time, for a system that allows equal ranking?

I wasn't suggesting hand tabulators put it into this format, or even that there be hand tabulation at all.

I am simply suggesting that the "raw" data -- essentially the description of all the ballots -- be available to the public after the election so that members of the public (reporters, bloggers/tweeters with some tech skills, etc) can download and process it, so they can make sure the later stages of tabulation are done correctly. (as well as to do other types of analysis on the data, including making pretty graphs/visualizers)

Remember, you were saying that Score makes it harder to cheat by computer hacking (compared with, for instance, STAR or Condorcet methods), basing this on the idea that the later stages of tabulation (the actual calculations) are simpler with Score. I'm saying the later stages don't need to be simple, as long as they can easily be verified by the public.

As for the format I showed (which gives a count for each of the ballot possibilities, rather than just listing all ballots), that just makes it more compact so it is easier to download and process. If they are cardinal ballots, or they allow full rankings and possibly equal rankings, it would make it bulkier but still pretty compact compared to a full list.
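For illustration, condensing a full list of ballots into counts per ballot possibility might look like the sketch below. It is not the linked converter.js and it assumes the ballots have already been parsed into rank strings; the parsing of the raw file is left out.

```javascript
// Condense a list of ballots (one rank string each) into { count, ranks }
// entries, most common ballot first.
function condenseBallots(rankStrings) {
  const counts = new Map();
  for (const ranks of rankStrings) {
    counts.set(ranks, (counts.get(ranks) ?? 0) + 1);
  }
  return [...counts]
    .map(([ranks, count]) => ({ count, ranks }))
    .sort((a, b) => b.count - a.count);
}

const condensed = condenseBallots(['gej', 'c', 'gej', 'ceg', 'c', 'gej']);
// condensed is [
//   { count: 3, ranks: 'gej' },
//   { count: 2, ranks: 'c' },
//   { count: 1, ranks: 'ceg' },
// ]
```

With cardinal ballots or equal rankings, the key would need to be some other canonical string for the ballot (for example the scores in candidate order), but the counting step stays the same.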