### Replace ELO with Win/Loss ratio

The "How to preserve LELO's" thread has spawned a separate conversation: the weakness of ELO.

Problem: ELO is cumulative. If you win 51% of your games, your ELO will keep climbing, ultimately to any level given enough games. I'm not going to fully explain it; there is a significant amount of information on ELO out there.

Solution: Track winning percentage and use it, along with a comparison to the opponent's win/loss record, to handicap games the way we currently do with ELO.

Corollary proposals: Display Land won and total games played. Use strength of schedule.

I'll work this out more later. The guys I've been talking to can add to this.

### Re: Replace ELO with Win/Loss ratio

Nastyogre wrote: Problem: ELO is cumulative. If you win 51% of your games, your ELO will keep climbing, ultimately to any level given enough games.

This is not true. If you win 51% of your games, and we assume for simplicity that you fight people at an average of 1600 Elo, then your Elo will naturally stop at 1607. At that point, instead of gaining 2 Elo per win and losing 2 per loss, you start gaining 1.96 per win and losing 2.04 per loss, for an expected value of 1.96 * 51% + (-2.04) * 49% = +0.9996 - 0.9996 = 0. Your Elo will not change further over time.
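The arithmetic above can be checked with a short sketch. It assumes the standard logistic Elo expected-score formula on a 400-point scale and k=4 (the value implied by the 2-points-per-even-game figure); the function names are made up for illustration and are not the server's code.

```python
import math

def expected_score(rating, opponent):
    """Standard logistic Elo expectation (assumed; 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))

def natural_rating(win_rate, opponent=1600.0):
    """Rating at which the expected score equals the actual win rate."""
    return opponent + 400.0 * math.log10(win_rate / (1.0 - win_rate))

def expected_change(rating, win_rate, opponent=1600.0, k=4.0):
    """Average Elo movement per game for a player with a fixed win rate."""
    e = expected_score(rating, opponent)
    gain_per_win = k * (1.0 - e)   # 1.96 at the settling point
    loss_per_loss = k * e          # 2.04 at the settling point
    return win_rate * gain_per_win - (1.0 - win_rate) * loss_per_loss

settle = natural_rating(0.51)
print(round(settle))                   # 1607
print(expected_change(1600.0, 0.51))   # about +0.04: still climbing at 1600
print(expected_change(settle, 0.51))   # about 0: no further drift
```

So a 51% player does climb at first, but the climb shrinks to nothing once the rating reaches the level that predicts a 51% win rate.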

Now, Elo does have weaknesses in comparing people across different time periods - chess has had an Elo inflation problem for a while, and we can even see it on our server to some extent (a 1650 Elo a week after cycle reset is far different from a 1650 Elo six months in) - but at any given time, Elo is a legitimate, statistically valid measure of player quality.

Nastyogre wrote: Solution: Track winning percentage and use it, along with a comparison to the opponent's win/loss record, to handicap games the way we currently do with ELO. Corollary proposals: Display Land won and total games played. Use strength of schedule.

You're basically trying to remake the Elo system here, only without as much detail or statistical power. There's no shame in that - Arpad Elo was a statistics professor and a chess master, he would of course have a much better understanding of this stuff than you or I would. But this is not a good wheel to re-invent.

The problem you're trying to solve is players wanting to lose games to drop their Elo, not the statistical measure itself. If losing is sometimes good and sometimes bad, then no statistical measure on the planet will measure player quality well. Specifically, the concern is players losing meaningless games to drop their Elo, and winning meaningful ones to reap the benefits of being a low-Elo player.

If you want a simpler solution than the ones we discussed on the other thread, just scale the k-value by the meaningfulness of the game. Instead of the flat k=4 that the server uses now, make it k = land swing/10 or something (where "land swing" = land the attacker would gain for winning + land the defender would gain for winning, to compensate for the 0% planet problem). That way, if you're rocking out with a heavy company battle on a capital planet, and the swing is +50 CP for the winning player, it'll have way more of an impact on your Elo than some dinky light aero battle on a 0% world where the winner will only get 4 CP at best. So yeah, you can intentionally lose 25 of those aero battles to make up for the one capital fight if you really want to, but neither you nor your faction will gain anything from the process with regard to Elo modifiers, net CP, or anything else (besides the usual rewards of activity, of course, but nobody objects to active players earning RP and C-bills).
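A rough sketch of the variable-k idea, for illustration only. The divisor of 10 comes from the "or something" suggestion above; the land-swing values of 50 and 4 CP, the even 1600-vs-1600 matchup, and all function names are hypothetical and not taken from the server.

```python
def expected_score(rating, opponent):
    """Standard logistic Elo expectation (assumed; 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))

def land_swing(attacker_gain, defender_gain):
    """Land the attacker would gain for winning plus land the defender would gain."""
    return attacker_gain + defender_gain

def variable_k(swing, divisor=10.0):
    """k scaled by how much land is at stake (divisor is a guess)."""
    return swing / divisor

def rating_change(rating, opponent, won, k):
    """Elo movement for one game at the given k."""
    score = 1.0 if won else 0.0
    return k * (score - expected_score(rating, opponent))

# Hypothetical stakes: a capital fight with a 50 CP swing, an aero skirmish with 4 CP.
capital_k = variable_k(50)   # 5.0
aero_k = variable_k(4)       # 0.4

print(rating_change(1600, 1600, won=True, k=capital_k))   # +2.5 for the capital win
print(rating_change(1600, 1600, won=False, k=aero_k))     # -0.2 for each thrown aero game
```

Under these made-up numbers, throwing a low-stakes game barely moves the rating, so sandbagging cheap fights stops being an efficient way to buy a lower Elo.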

### Re: Replace ELO with Win/Loss ratio

You are right, I'm oversimplifying. But Elo inflation is a problem. Elo is very accurate when players have similar numbers of games. It's not when one player has played a lot more games. Thus Baruk has a higher ELO than Chaser despite Chaser being more successful. Correct me if I'm wrong, but that is due to ELO inflation.

### Re: Replace ELO with Win/Loss ratio

The reason why Chaser and Baruk have similar Elos despite very different player skill levels is that the k-value is too low. At k=4, which is what we have, an 80% player's rating will move by less than a point per game on average after only a few dozen games, despite being 200+ points below their proper level. It'd take 350+ games to get within 50 points of where they should be. That's way too long.

I made some simplifying assumptions (all games vs a 1600-Elo opponent, averaging scores out so every game earns the mean score instead of varying wins and losses) and ran up an Excel sheet.

60% player "natural Elo" = 1672

60% player, 50 games, k=4 -> Elo 1617, improving 0.3 points/game

60% player, 50 games, k=8 -> Elo 1631, improving 0.45 points/game

60% player, 50 games, k=16 -> Elo 1648, improving 0.52 points/game

60% player, 50 games, k=32 -> Elo 1663, improving 0.33 points/game

80% player "natural Elo" = 1841

80% player, 50 games, k=4 -> Elo 1652, improving 0.91 points/game

80% player, 50 games, k=8 -> Elo 1692, improving 1.38 points/game

80% player, 50 games, k=16 -> Elo 1749, improving 1.66 points/game

80% player, 50 games, k=32 -> Elo 1800, improving 1.34 points/game

(The movement per game drops at k=32 because their rating is so much closer to natural)
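The spreadsheet can be approximated with a few lines of code. This is a sketch under the same simplifying assumptions stated above (every game against a 1600-rated opponent, every game scoring the player's mean win rate), plus the standard logistic expected-score formula, which is an assumption about what the sheet used. Small differences from the figures above are possible; for instance, this formula puts the 60% natural rating nearer 1670 than 1672.

```python
import math

def expected_score(rating, opponent=1600.0):
    """Standard logistic Elo expectation (assumed; 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))

def natural_rating(win_rate, opponent=1600.0):
    """Rating at which the expected score equals the win rate."""
    return opponent + 400.0 * math.log10(win_rate / (1.0 - win_rate))

def simulate(win_rate, k, games=50, start=1600.0, opponent=1600.0):
    """Return (rating after `games`, points/game still being gained at that point)."""
    rating = start
    for _ in range(games):
        # Each game earns the mean score rather than a random win or loss.
        rating += k * (win_rate - expected_score(rating, opponent))
    per_game = k * (win_rate - expected_score(rating, opponent))
    return rating, per_game

for win_rate in (0.6, 0.8):
    print(f"{win_rate:.0%} player, natural Elo ~ {natural_rating(win_rate):.0f}")
    for k in (4, 8, 16, 32):
        rating, per_game = simulate(win_rate, k)
        print(f"  k={k:>2}: Elo {rating:.0f}, improving {per_game:.2f} points/game")
```

Raising `games` shows how long each k takes to close the remaining gap to the natural rating.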

IMO, k=16 would be about right. There'd be almost a hundred-point difference between Baruk and Chaser if we had that, instead of them looking equally strong, but a few games won't swing anyone too much. (If the variable-k suggestion I made above were used, then this would be a suggested average, of course.)

### Re: Replace ELO with Win/Loss ratio

So why wouldn't we want something closer to natural? Is that because it's more subject to Elo inflation?

### Re: Replace ELO with Win/Loss ratio

Inflation is mostly a function of new players who are worse than the established vets being a natural Elo source - if the average newb is really a 1550, but they come in at 1600, every newb donates 50 points to the vet pool. Later in cycle = more newbs have joined = more Elo has flowed upwards. Like I said, it's a problem when comparing ratings over time, but at any given time Elo is perfectly good.

### Re: Replace ELO with Win/Loss ratio

Alsadius wrote: The reason why Chaser and Baruk have similar Elos despite very different player skill levels is that the k-value is too low. At k=4, which is what we have, an 80% player's rating will move by less than a point per game on average after only a few dozen games, despite being 200+ points below their proper level. It'd take 350+ games to get within 50 points of where they should be. That's way too long.

Our K=8, not 4.

### Re: Replace ELO with Win/Loss ratio

Is ELO creep entirely attributable to "donated" ELO when a new player isn't properly ranked?

### Re: Replace ELO with Win/Loss ratio

Spork wrote: Our K=8, not 4.

Ah, sorry. Good to know. I thought it was 4 from some previous threads I dug up on the topic (specifically, #9 in this thread - perhaps I mistook what was meant there).

Nastyogre wrote: Is ELO creep entirely attributable to "donated" ELO when a new player isn't properly ranked?

That was my understanding, yes. A plain vanilla Elo system will never add or remove points from the system as a whole due to games played, it'll only redistribute them. The only source of new points is new players, and the way for existing players to grow their rankings over time is to take points from those new players. The average of the player pool (if you include the players who have left) will always be 1600, it's just how those points are divided that changes.
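The redistribution point can be illustrated with a small sketch. It assumes a plain Elo update where the winner gains exactly what the loser drops, uses K=8 as stated in this thread, and invents a two-player pool ("vet" and "newb") with made-up ratings and a made-up run of results.

```python
def expected_score(rating, opponent):
    """Standard logistic Elo expectation (assumed; 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))

def play(ratings, a, b, a_won, k=8.0):
    """Apply one game's result; whatever `a` gains, `b` loses."""
    delta = k * ((1.0 if a_won else 0.0) - expected_score(ratings[a], ratings[b]))
    ratings[a] += delta
    ratings[b] -= delta

# Hypothetical pool: an established vet and a newcomer seeded at 1600
# who is actually weaker than that seed suggests.
ratings = {"vet": 1650.0, "newb": 1600.0}
total_before = sum(ratings.values())

for _ in range(10):
    play(ratings, "vet", "newb", a_won=True)   # the overrated newcomer keeps losing

total_after = sum(ratings.values())
print(ratings)                      # vet has risen, newb has fallen
print(total_before, total_after)    # totals match up to floating-point rounding
```

The pool total never changes; the vet's rating only rises because the newcomer's seed was too generous, which is the "donation" described above.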

### Re: Replace ELO with Win/Loss ratio

Alsadius wrote: Ah, sorry. Good to know. I thought it was 4 from some previous threads I dug up on the topic (specifically, #9 in this thread - perhaps I mistook what was meant there).

The ELO exponent, set to 4, is the slanting of outcomes. A lower exponent there means a more "pure" payout of land and C-bills.