
Why ratings are stupid to worry about

So then how does an intangible like pressure get a numerical value in an equation? Does a player experience more or less pressure if they only get to play 5 tournaments a year vs. the next guy on the card who plays 50 events in that same time frame?

So-called "intangibles" are not criteria for estimation. Only raw scores and prior estimates of player ratings (propagators) matter. The current system couldn't care less whether a player experiences more or less pressure; it considers only the resultant outcome, the raw score.
 
Oops, missed that Chuck already responded to this.

Summary: Ratings do not care if you a) have the sniffles, b) have a broken foot, c) just lost your goldfish, or d) threw OB multiple times, etc.

What you contribute is your RAW SCORE. Fewer strokes relative to others, controlling for previous estimated ratings, is all that matters. How is this not the best possible system for opening up registration at limited tournaments?

Does the system need a small tweak or two? I think so. However, on the whole, it has a strong statistical position even as it is currently implemented.
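
For the curious, here's what that principle looks like in code - a minimal sketch, not the PDGA's actual formula (which isn't public); the roughly 10-points-per-throw scale and all the numbers are assumptions for illustration:

```python
# A minimal sketch of "only raw scores and prior ratings matter". This is NOT
# the PDGA's actual algorithm (which is not public); the linear scale of
# roughly 10 rating points per throw is an assumption for illustration.
POINTS_PER_THROW = 10  # assumed scale factor

def round_rating(your_score, prop_scores, prop_ratings):
    """Rate a round purely from raw scores and propagators' prior ratings."""
    avg_prop_rating = sum(prop_ratings) / len(prop_ratings)
    avg_prop_score = sum(prop_scores) / len(prop_scores)
    # Fewer throws than the propagator average -> rating above their average.
    return avg_prop_rating + POINTS_PER_THROW * (avg_prop_score - your_score)

# Sniffles, broken feet, and dead goldfish never enter the function:
print(round_rating(54, prop_scores=[55, 57, 56], prop_ratings=[940, 930, 950]))  # 960.0
```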



Consider registration for an Open-division tournament that will sell out in less than a minute: what factor should be primary in determining who can register?
1 - PDGA rating
2 - Speed of internet connection
3 - Economics (who is willing to pay a higher entry fee)
4 - Friends with the TD/organizers
5 - Lottery system (edit to add)
6 - Other options (help me out :)

Given these options, I'll take 1 (saying this as a low-end rated open player - i.e., I propose to you that this is not a self-interested position). Remember that you must choose at least one factor. "Leaving it up to chance" is a vote for #2; you cannot be "neutral."
 

I think you forgot some options:

Touring Pro Cards, PDGA Points, Qualification system similar to the Vibram Open.
 

Thank you sir :hfive:

7 - Touring Pro Cards
8 - PDGA points
9 - Pre-tournament Quals (e.g., Vibram, USDGC)
 

The PDGA points system, with perhaps average rating over the last 12 months as a tiebreaker. I'd add the suggestion that certain events use an alternate calendar year to count points (say, for the 2016 Memorial, counting points from October 2014 through September 2015).
 

Sounds like a very reasonable system to me (and average rating is going to be correlated with reported PDGA rating at something like .8-1.0). But, what if Ron Russell wants in? :eek:
 
I am moving up to open master/grandmaster next year anyway, so it is a non-issue for me.

Sorry man, there are no divisions called open masters or open grandmasters. Sorry to be a stickler. The mistake of calling those divisions "open" is way too common and continuously perpetuated. :wall:

I hear Am2 all of the time around here as well. I guess this should be in the 'pet peeve' thread.
 
1. tour card holders (with highly upgraded requirements to hold one)
2. mullets with the fastest internet

Of course, holding a tour card would be based on a combo of rating and a minimum number of events played. Ron would be out of luck unless world champs were grandfathered in.
 

I agree with you on the open thing. However, calling Intermediate "Am2" does not get my shorts in a bunch; the PDGA code for it is MA2, after all.
 
Mullet required?

If so, I'm out. Wife > Disc golf :)
 
Not true. The average rating of the propagators does not matter as long as they are playing in the same conditions. However, you only see top-rated players in higher-tier events with more tournament pressure, which will produce a higher rating for the same score. If you're a local who is not affected by tournament pressure, and you're able to shoot as well in a local high-tier tournament as in your rec rounds on those courses, then you'll get a higher rating for the same score. But the stats show otherwise; i.e., locals choke up under pressure just like the top-rated pros. In fact, they choke up even more than the top pros, who are more used to the pressure. That's one factor that produces the pros' higher ratings.

Because I want to make sure I know exactly what you mean - how does that work out if it's the same score but now rated higher, and the pressure aspect doesn't have a numerical value in your calculations?

Here's a thought experiment:
Take hypothetical course XYZ. A tournament event is played there by 20 players with an average rating of 940. The highest player rating is 970 and the lowest is 900. Since it is a small event, there is not much pressure on the players. There ends up being a 10-throw difference between the best score and the worst score. As Chuck explained it, the Average In = the Average Out. This would mean that the round ratings of all 20 rounds would average 940.

Take the same course 6 months later under nearly identical conditions. This time it's an invitational event with a $10,000 all-or-nothing first-place prize. 20 top pros are invited to play. The highest rated is 1050 and the lowest rated is 980. Their average rating is 1020. Once all scores are in, there is a 10-throw difference between the best score and the worst score. These scores happen to be exactly the same as in the scenario above. If the Average In = the Average Out, then the average round rating of this group would have to be 1020.

The SSA of the course hasn't changed. If pressure is not a value in the equation, then how can these two rounds be 80 points different? Are the higher-rated players awarded higher-rated rounds because they are rated higher coming in? If both pools of players were combined into the later event, would the round ratings stay the same, or would they then drop to a lower average to account for the lower-rated players bringing down the Average In?

Even though it's a scenario that would 99.999% never happen, it's a thought experiment to test the 'fairness' of how a rating is determined, since the ratings system doesn't care if you have the sniffles, your goldfish died, or your throwing arm was ripped off halfway through your round.

To the point of the OP, this is why ratings aren't worth worrying about.
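
To put numbers on it, here's the hypothetical in code, taking Average In = Average Out literally (same toy 10-points-per-throw scale as the sketch above; every figure is invented for the experiment):

```python
# The two hypothetical XYZ events, taking "Average In = Average Out" literally.
# Same toy 10-points-per-throw scale as above; every number is invented.
def rate_field(scores, avg_rating_in, points_per_throw=10):
    avg_score = sum(scores) / len(scores)
    # Anchor the field's average round rating to its average incoming rating.
    return [avg_rating_in + points_per_throw * (avg_score - s) for s in scores]

scores = list(range(50, 60)) * 2          # 20 scores spanning a 10-throw spread
local_rounds = rate_field(scores, 940)    # small local event, average rating 940
invite_rounds = rate_field(scores, 1020)  # $10,000 invitational, average rating 1020

# Identical raw scores, yet every round rating differs by exactly 80 points,
# because the 80-point gap in average incoming rating is baked into the setup.
print(invite_rounds[0] - local_rounds[0])  # 80.0
```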
 
There's no such thing as a fixed SSA. It's always situational, as determined by the scores of the propagators. Propagators are "measuring sticks": each one's score indicates their measurement of how difficult the course played that round. Some shoot better than their skill level, more shoot about the same, and some shoot worse. But on average they shoot their rating, within a standard deviation that gets smaller the more propagators are indicating their measurement by their score.
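
In code, the "measuring stick" idea might look roughly like this (a sketch under an assumed 10-points-per-throw scale; the scores and ratings are made up, and the PDGA's exact math isn't published):

```python
import statistics

# Each propagator's score implies an estimate of what a 1000-rated player
# would have shot (the situational SSA). The scores, ratings, and the
# 10-points-per-throw scale are assumptions, not the PDGA's published math.
def implied_ssa(prop_score, prop_rating, points_per_throw=10):
    # e.g. a 940-rated prop shooting 58 implies a 1000-rated player shoots ~6 less
    return prop_score - (1000 - prop_rating) / points_per_throw

ratings = [940, 955, 1010, 980, 925]
scores = [58, 55, 51, 54, 57]  # hypothetical round scores
estimates = [implied_ssa(s, r) for s, r in zip(scores, ratings)]

# More props -> the mean of these estimates stabilizes (smaller standard error).
print(statistics.mean(estimates), statistics.stdev(estimates))
```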
 

The rounds are 80 points different because your thought experiment a priori defined the rounds as 80 points different. In this hypothetical situation (that would "never happen") it was a foregone conclusion that you would observe 80 points of difference.

Also, a 10-shot difference between the best and worst scores does not mean anything unless you know the prior ratings of those who shot the best versus the worst rounds. There will be strong covariance between prior ratings and round scores (because players who shot better scores in the past will tend to shoot better scores currently - a rather trivial assumption). Your IN=OUT conception fails to account for this covariance, while the actual ratings calculations necessarily account for it by incorporating players' prior ratings.

One step further: these kinds of hypotheticals are not useful for evaluating the system; you must work with actual or simulated data to test the influence of prior player ratings and hypothetical conditions.

Though this is a good argument for the PDGA opening up the vaults on the ratings algorithm and datasets.
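
For example, a quick simulation along these lines (every parameter - the SSA, the points-per-throw scale, the noise - is an invented assumption) lets you actually test whether average-in tracks average-out instead of assuming it:

```python
import random

random.seed(1)

# Simulate a field whose scores covary with prior ratings (the covariance at
# issue), then see what a rating-style calculation recovers. Every parameter
# (SSA 50, 10 points per throw, noise SD 3) is an invented assumption.
SSA, PTS_PER_THROW = 50.0, 10.0

priors = [random.gauss(960, 40) for _ in range(1000)]
# Better-rated players tend to shoot lower scores, plus round-to-round noise.
scores = [SSA - (r - 1000) / PTS_PER_THROW + random.gauss(0, 3) for r in priors]

# Rate each round against the simulation's true SSA (in the real system the
# SSA itself would be estimated from these same propagators).
round_ratings = [1000 + PTS_PER_THROW * (SSA - s) for s in scores]

avg_in = sum(priors) / len(priors)
avg_out = sum(round_ratings) / len(round_ratings)
print(f"avg in: {avg_in:.1f}, avg out: {avg_out:.1f}")  # close, but not forced equal
```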
 
A simplified meta-thought experiment to further examine the first point I bring up above:

Feldberg (1040-rated) shoots a 43 on course XYZ on day 1 (rated 1040)
My dad (820-rated) shoots a 43 on course XYZ on day 2 (rated 820)

Either 1) the ratings system is flawed, 2) the assumptions of the thought experiment are unreasonable/unwarranted, or 3) Feldberg's goldfish (lil' Borg) died.

Imma go with 2
 
There's no such thing as a fixed SSA. It's always situational, as determined by the scores of the propagators.

As the experiment suggested, the course and conditions had not changed in the 6 months between the first event and the second. Wouldn't the SSA then be the same for both rounds? As it's defined on the PDGA site, it's what a 1000-rated player would average on that course. Would the SSA change that dramatically from the first group to the next, given these conditions?


I wasn't trying to assume that the rounds must be 80 points different, just trying to determine whether that conclusion was correct. Again, IN=OUT is how Chuck explained it to me in the past.

And it would seem true that if the better players shot the better scores, the rating spread per stroke would be larger than if it were reversed, with the lower-rated players shooting the better scores.

But how can these kinds of thought experiments not be useful or valid - it's just simulated data. If it needs to be more grounded in actual prior data, then you can plug in anyone from the PDGA player database who matches the needed criteria.
 
it's just simulated data. If it needs to be more grounded in actual prior data, then you can plug in anyone from the PDGA player database who matches the needed criteria.

True, it could be thought of as simulated data - but simulated from a scenario that has potentially (almost certainly) no basis in reality. Which is why it does need to be grounded, as you suggest, in actual prior data or in data generated by a model known to accurately reflect reality.
 
This doesn't require a thought experiment. The typical propagator will shoot more than 3 throws better than their rating in about 1 in 6 rounds. The same is true in the opposite direction. That means 2/3 of their rounds are within +/-3 throws on a 50-SSA course. If we have only 5 props, say for a league round, the odds that all 5 will shoot more than 3 throws worse than their rating are 1 in 7776. So there's a pretty good chance the variance in the SSA produced with just 5 props will be less than +/-3 from the "real" value, and of course it narrows even more with many more props. So when you see a variance of, say, 30 rating points between numbers that have a lot of props behind them, it's more likely there's a physical reason behind it (i.e., not the same conditions) than normal statistical variance of the props and the consistent calculation formulas that have been used over 15 years.
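
For anyone who wants to check the arithmetic (taking the post's 1-in-6 figure as given, and assuming the props' rounds are independent):

```python
from math import comb

# The post's premise: a prop shoots 3+ throws worse than their rating with
# probability about 1/6, independently across props.
p = 1 / 6
print(6 ** 5)  # all 5 props 3+ throws worse: 1 chance in 7776

# Binomial tail: probability that 3 or more of the 5 props are simultaneously
# 3+ throws worse. Also small, which is why even 5 props pin the SSA down.
tail = sum(comb(5, k) * p ** k * (1 - p) ** (5 - k) for k in range(3, 6))
print(tail)  # ~0.035
```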
 

Should a mathematical formula's validity be determined by the actual likelihood of the data set being provided? Or should the extreme ends of the data possibilities be used to determine whether or not the formula allows for discrepancies?
 
Yes, this is my point. You cannot just "think up" a scenario, because you have no evidence for its plausibility (as in the silly scenario I just generated). A model is useful to the extent that it can explain the available data, independent of its ability to fit hypotheticals.

Thus, use actual data or known simulations. Intuitions, hunches, and hypotheticals aren't productive avenues for model evaluation.
 
