
Combined ratings on 2 day events.

Allurex (* Ace Member * | Joined Jan 5, 2012 | Messages: 2,048 | Location: Kansas City)
Here in KC, Dynamic Discs runs a ton of events (B-Tiers), and this year they decided to run them over 2 days instead of one. For example, the Pro Men, Pro Women, Masters, and Advanced players play on Saturday, and the lower AM divisions, Grandmasters, and Women AMs play on the other day.

They've had great success with it, allowing for more than 72/90 players to sign up, and the course isn't nearly as congested.

I noticed on the most recent ratings update that the PDGA went ahead and combined the ratings for the 2 days, even though they were technically separate PDGA events. Note how the SMP Frostbreaker was listed:

Frostbreaker Day 1 (MA2, MA3, MG1, FA2) - Link
Frostbreaker Day 2 (MPO, MPM, MPG, FPO, MA1, FA1, MM1) - Link

So the ratings were combined together for the 2 days (for example, a 60 was rated 927 for both days). Same course set-up, same tees and all of that.

The only problem was that it snowed 2 inches during the Sunday edition of the tournament, so the conditions were not nearly the same on both days.

When the ratings were unofficial, the ratings for Sunday were about 25 points higher than similar scores on Saturday. Many players, myself included, figured this was 100% correct, because of how much harder the conditions were on Sunday. Shooting a 60 was much easier on Saturday because of the better weather. So why combine the ratings?

They've done this for other events, but I think the average event doesn't have such drastically different weather conditions from 1 day to the next. When I asked the TD about this, he said that he just submitted the reports and the PDGA combined the ratings themselves, and that he didn't really see anything to be done about it.

Are ratings supposed to be combined like this? I feel like different pools playing events on the same courses shouldn't have combined ratings if the conditions are so drastically different.

If this was done in error, is there someone in the PDGA that I can contact? Is there a way to get it fixed?
 
Multiple rounds from the same course on the same weekend have been combined for ratings purposes for a number of years now. If the delta between the rounds' SSAs is too large before combining, they will be rated independently. I have yet to see one where that was the case, though (not that they don't exist).
 
Also, if the TD feels the conditions were vastly different, they need to note it in the event report, in which case the rounds would not be combined.
 
There's a specific set of automated guidelines that determines when rounds are combined or not, based on the number of propagators producing an SSA and how different the SSAs are. As Krupicka indicates, you normally see the rounds on the same layout combined even in different events. But if you look at R2 of the Memorial you'll see an example where each pro division got a separate rating for the same score at Fountain, and those were different from the ratings they all got on R4 Saturday at the Fountain. You'll soon see what wildly separate numbers will be produced for TX States in the April 22nd update, where we might have 6 or 7 different ratings for the same score in different divisions over that 3-day event.

As Biscoe noted just above this post, the TD can indicate the different weather conditions but it will only matter if Roger sees the numbers are right on the edge of being combined or not.
 
I don't have a specific two event weekend example I can remember without asking Roger to research it. But we have all kinds of examples of different propagator pools playing the same course on different days and getting all of the rounds averaged together. Pretty much every Worlds is done that way unless there are specific days that we know to check and see if it should be broken out. Like I said, the process to combine/separate is primarily automatic now so we don't have to rely on TD and player reports.
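
To make the "primarily automatic" part concrete, here's a minimal sketch of what a combine/separate check could look like. This is purely illustrative: the function name, the SSA-delta threshold, and the minimum propagator count are assumptions, not the PDGA's actual published rules.

def should_combine(ssa_a, ssa_b, propagators_a, propagators_b,
                   max_ssa_delta=1.0, min_propagators=5):
    # Combine two rounds on the same layout only if both have enough
    # propagators and their SSAs are close enough (thresholds are made up).
    enough_data = min(propagators_a, propagators_b) >= min_propagators
    close_enough = abs(ssa_a - ssa_b) <= max_ssa_delta
    return enough_data and close_enough

# Example: SSAs of 50.4 and 50.9 from two different pools would be combined here.
print(should_combine(50.4, 50.9, propagators_a=20, propagators_b=35))  # True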
 
Why is it done at all?

I cannot come up with a single good reason for it.

And Allurex just pointed out a very good reason not to.

Open source the rating calculations. Now.
 
Why is it done at all?

I cannot come up with a single good reason for it.

And Allurex just pointed out a very good reason not to.

Open source the rating calculations. Now.

I would guess one reason would be that the more numbers you average together, the more reliable the average. Combining rounds makes each round's rating less susceptible to variation from individual hot or cold rounds.

Then again, enough are combined and averaged when producing player ratings that it hardly matters either way. And producing player ratings is the main point, not the sanctity of individual round ratings.
 
Multiple rounds are combined because they theoretically should provide more accurate ratings with more data points. IMO, the threshold for determining if two rounds should be combined needs to be adjusted. Some rounds are being combined that probably shouldn't be.

The ratings calculation information is all out there if you've been paying attention over the years. The ratings system is a big selling point for memberships. It is wise for the PDGA to protect their intellectual property.
 
If the two combined rounds have the same pool of players, the effect of combining the rounds is a wash on the player ratings. The benefit of combining rounds is increased when the combined rounds are for two different pools of players.
 
Two reasons
1. No way to know which round's SSA number is more accurate, so you average them to theoretically get a more accurate number (see the sketch after this list).

2. Perception of fairness not necessarily actual fairness. Players feel it's more fair when they perceive the course playing conditions are the same. Thus, all players should get the same rating for the same score on the same layout.
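
Here's a minimal sketch of that averaging idea, weighting each round's SSA by its propagator count. The weighting scheme is an assumption for illustration; the PDGA's actual method isn't spelled out in this thread.

def combined_ssa(ssa_a, props_a, ssa_b, props_b):
    # Propagator-weighted average of two rounds' SSAs on the same layout
    # (the weighting is an assumed, illustrative choice).
    return (ssa_a * props_a + ssa_b * props_b) / (props_a + props_b)

# Saturday pool vs. Sunday pool on the same layout:
print(combined_ssa(50.0, 40, 51.0, 25))  # ~50.38, used for both days' ratings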
 
I get that the larger sample size created by combining rounds tends to make outlier scores less statistically significant, but when conditions are substantially different from day to day (or even round to round), I think the TD should be able to indicate such when submitting the scores to ensure the rounds are not combined. Seems that would yield round ratings that are more indicative of how well a given round was actually played... isn't that the purpose of round ratings?

Admittedly, "substantially different" is open to interpretation by the TD, but many PDGA rules effectively leave things to the TD's judgement. If the TD thinks things are so different from one round to another, they should be able to state such to maintain the integrity of their event.
 
Seems that would yield round ratings that are more indicative of how well a given round was actually played... isn't that the purpose of round ratings?

The main purpose is to produce accurate player ratings, not individual round ratings.

The player ratings determine which division Ams are eligible for.

If the individual round ratings aren't precise, little harm is done because, by the time they get averaged with your other round ratings, they'll balance out.
 
I get that the larger sample size created by combining rounds tends to make outlier scores less statistically significant, but when conditions are substantially different from day to day (or even round to round), I think the TD should be able to indicate such when submitting the scores to ensure the rounds are not combined. Seems that would yield round ratings that are more indicative of how well a given round was actually played... isn't that the purpose of round ratings?

Admittedly, "substantially different" is open to interpretation by the TD, but many PDGA rules effectively leave things to the TD's judgement. If the TD thinks things are so different from one round to another, they should be able to state such to maintain the integrity of their event.

The thing about "substantially different" is that, based on what Chuck is saying, rounds that are combined are combined because the stats objectively say they aren't substantially different. It's not as though a round where the SSA worked out to 53.1 is being combined with a round on another day where conditions inflated the SSA to 59 or something. If the SSA on each day comes out within a stroke of the other, combining them doesn't significantly change the outcome of the calculation... it's likely a shift of no more than 0.5 or 0.6 in either direction. That's only a difference of a few ratings points in the end. And realistically, there's no notable difference between a 998 and a 1003 rating. It's less than a throw.
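
A quick back-of-the-envelope check of that "few ratings points" claim, using the rough rule of thumb of about 10 rating points per throw (an approximation; the real scale varies with SSA):

POINTS_PER_THROW = 10        # assumed rule-of-thumb conversion, not an official constant

ssa_shift = 0.5              # SSA movement from combining two close rounds
rating_shift = ssa_shift * POINTS_PER_THROW
print(rating_shift)          # 5.0 -> roughly five rating points, less than one throw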
 
I guess I'll just say "fair 'nuff" to both points, and consider myself slightly more enlightened on the subject.
 
Here's an exaggerated example to keep the math simple. Let's say one round produces a 50 SSA and another round produces 51 on the same layout. Let's say the first round had no wind and there was a light breeze in the 51 round. Let's say there's a 95% chance that the SSA produced under no wind ranges from 48-52 for this smaller pool of propagators and it's 49-53 for the round in the breeze.

If all the ratings team knows is the 50 & 51 on the same layout and nothing about the wind, statistically we know these two rounds could have been thrown in similar conditions so we combine them. Even if the TD says the second round was windy, the numbers are not far enough apart for us to be sure there's a reason to rate them separately.

In fact, the numbers and probable variance ranges overlap enough that it's possible the 51 SSA was produced during no wind and the 50 was produced with the wind. If we kept these rounds rated separately, players would wonder what's going on when they knew the windier conditions were "tougher". That's what happens on a regular basis when you see unofficial ratings where a windier round gets lower ratings for the same score. It's just the normal variance in the precision of the ratings process, especially with a low number of propagators. Once the two rounds are combined, players end up with official ratings with the same score getting the same rating.

So it's better to rely on an objective statistics process to determine whether to combine or separate rather than specifically rely on whether the TD posts the weather report. It's more consistent for everyone.
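
Chuck's example in code form, using the numbers from his post. The overlap test itself is just an illustration of the reasoning, not the actual PDGA procedure.

def ranges_overlap(lo_a, hi_a, lo_b, hi_b):
    # Two intervals overlap if each one starts before the other ends.
    return lo_a <= hi_b and lo_b <= hi_a

calm_round = (48.0, 52.0)    # SSA 50, no wind, 95% range from the post
breezy_round = (49.0, 53.0)  # SSA 51, light breeze, 95% range from the post

# The ranges overlap heavily, so the rounds are statistically
# indistinguishable and get combined.
print(ranges_overlap(*calm_round, *breezy_round))  # True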
 
I think that happens often when the same layout is played for two rounds by the same pool of players. It might be windier in the afternoon, but the players learned a thing or two the first round and corrected it for the second.
 
The ratings system is a big selling point for memberships. It is wise for the PDGA to protect their intellectual property.

Could not disagree more. Yes, the rating is a big thing about the membership. But that is due to it being official. Not because it is something that is concocted semi-secretly. Non-members can still derive their own rating from events they participate in - and as you say, the general principle is not exactly a secret, so it is not as if it could not be replicated. That more or less sinks the IP argument.

Having a helping of secret sauce in the ratings system is about as unprofessional as it gets, imho, when it comes to sports and games. That part is more off-putting than inviting.

It was a big thing for me when I started playing ball golf that I could calculate my own handicap, before joining a club and getting an official one. It fueled the ambition of getting better and was a major factor in changing me from the occasional player to a dedicated and ambitious player - and hence member of the governing organisations.

Let people include themselves before they actually join up. Give the casual player access to an official measuring stick and they will soon want the stamp that makes their use of it official.
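
In that spirit, here's a hedged sketch of the kind of do-it-yourself estimate the post describes. It assumes a round equal to the SSA rates about 1000 and that each throw is worth roughly 10 rating points; both are rough approximations floating around the community, not the official formula.

def estimate_round_rating(score, ssa, points_per_throw=10.0):
    # Rough DIY estimate: SSA ~ a 1000-rated round, ~10 points per throw
    # (both values are assumed approximations, not the PDGA's formula).
    return 1000.0 - (score - ssa) * points_per_throw

# A 55 on a layout playing to a 50 SSA:
print(estimate_round_rating(55, 50))  # 950.0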
 
If all the ratings team knows is the 50 & 51 on the same layout and nothing about the wind, statistically we know these two rounds could have been thrown in similar conditions so we combine them. Even if the TD says the second round was windy, the numbers are not far enough apart for us to be sure there's a reason to rate them separately.

Could it not be argued that "we don't know" would be the reason for not combining them?
Is it not one of the core ideas in the ratings system that you do not make such judgements?

Those two results might also be outliers in "ranges" that do not overlap, correct?
Yet all such cases will be treated as if they _must_ be from overlapping ranges?

I still don't understand what this solves.

I have heard more questions about how the same score can produce the same exact rating in two different rounds than questions about how the same score can give different ratings. People tend to understand the latter - that you are playing the field. Not so much how two different rounds can be treated as if they were the same - as that flies in the face of the concept.

That is purely anecdotal ofc, so take it as that.
 
That is purely anecdotal ofc, so take it as that.

I think this is an important thing to look at. Anecdotal arguments can be made for just about anything; Chuck's point is that looking only at the numbers, and not having subjective criteria based on anecdotal evidence, avoids those issues. Why does it matter if there were slightly different conditions if the scores show the rounds played basically the same? Does half a stroke worth of ratings points in either round matter enough to not use the data in the most statistically significant way?
 