Monday, May 5, 2008

the Poll Study: Methodology

Before leaving comments critiquing this study of the polls, please read the following explanation… it should answer a lot of your questions.

Adjusting the Data

If you add up the number of losses by Top 25 teams in this table, you'll see that more losses are accounted for than the 1,028 games and 1,041 games I used to calculate the averages in the main table. That's because I had to drop 5 games from the AP data and 8 games from the Coaches data – here's why.

Right off the bat, there's one game we have to throw out because it skews the averages, and it should be no surprise – #5 Michigan losing to I-AA Appalachian State in 2007. This game was a perfect storm of an upset: it was the first time a I-AA team had ever beaten a Top 25 team, and Michigan dropped 27 spots in the AP Poll and 22 spots in the Coaches Poll. Because of Michigan's high ranking, because it was the first week of the season, and because Appalachian State was I-AA, including the -27 and -22 in our averages would give an inaccurate view of the typical drop for a Top 5 team after a loss, of the typical September drop after a loss, and so on. The game is still included in all of the relevant counts of weeks and in Michigan's numbers, but the drops from #5 to #32 and #5 to #27 aren't part of the averages.

All of the other games I didn't use in the averages came from situations where a team played two games between rankings. That happened for two main reasons. First, a team played a late-weekend game after the rankings came out (a Sunday night or Monday game) and then played again the next Saturday. Second, and more common, the polls skipped a week and a batch of teams played two games between sets of rankings, throwing the next week's poll out of whack for them. This only happened three times, and it always involved the first week of the season. In 1999, the Coaches skipped rankings after the first short week of the season – only five games were played the last weekend in August. The following week, seven of the ten teams involved played their second game of the season, racking up two games between the preseason and week-two rankings. In 2003 and 2004 the same thing happened, only those times it was the AP voters who skipped the first short week of the season. (That's why the Coaches have 184 weeks of rankings and the AP has 183.) In all of these cases, the rankings have to come out of the study and the averages because there's no way to tell which of the two games had the greater impact on a team's movement up or down.
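If it helps to see the filtering as code, here's a minimal Python sketch. The field names (season, team, opponent, games_between_polls) and the sample records are hypothetical stand-ins for the real data, not the actual tables behind the study:

```python
# Minimal sketch of the filtering described above. Field names and the
# sample records are hypothetical stand-ins for the actual poll data.

all_losses = [
    {"season": 2007, "team": "Michigan", "opponent": "Appalachian State",
     "games_between_polls": 1},
    {"season": 1999, "team": "SomeTop25Team", "opponent": "SomeOpponent",
     "games_between_polls": 2},  # played twice before the next poll
]

def usable_for_averages(loss):
    """True if a ranked team's loss should count toward the drop averages."""
    # The 2007 Michigan/Appalachian State outlier stays in the week counts
    # and in Michigan's own numbers, but not in the averages.
    if (loss["season"] == 2007 and loss["team"] == "Michigan"
            and loss["opponent"] == "Appalachian State"):
        return False
    # Two games between consecutive polls (a late-weekend game, or the
    # skipped poll weeks in 1999/2003/2004) -- no way to tell which game
    # drove the movement, so drop the loss from the averages.
    if loss["games_between_polls"] > 1:
        return False
    return True

losses_for_averages = [g for g in all_losses if usable_for_averages(g)]
```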

Examining the Data and Factors

Let's get one very important thing straight – just because I'm saying that factors such as Upset or Month are relevant to, or correlated with, how much a team drops after losing, that doesn't mean I'm saying those categories cause the drops. Quality of play (or performance) can plausibly be invoked as a cause for some factors, such as Margin of Loss or maybe Upset – the less you lost by, the better you played, or maybe losing to a better team showed your ability to perform at a high level. But how does quality of play connect to the Month, or your Rank, or how long you've been ranked? Beats me. I have no idea why voters choose to drop some teams more or less than others – but this study shows that they definitely do. The correlations between the factors and the number of spots dropped are there.

Here are all of the relevant tables showing which factors correlate well and which don’t.

Combining Factors

So how did I determine that Upset, Month, and Rank/MoL are the key factors in the polls? Well, as I mentioned in the original post, I simply combined the factors. Doing so lets us measure the percentage of the time that each factor lines up with the averages – the higher the percentage, the stronger the correlation. Let's take a look at one of them.

From our initial numbers, we know that:
1) upsets drop you further than non-upsets;
2) the earlier in the season you lose, the more spots you drop;
3) the more points you lose by, the more spots you drop.
So in theory, those statements should hold steady when all other factors are equal. The table below lists all three of those factors together:

Looking at just the first four lines, we can see that the first two factors are held equal – the losses tallied were all non-upsets in bowl games. The third factor, margin of loss, goes from smallest to largest. But do the average drops follow? They should, if margin of loss is highly relevant. It looks like they do in the Coaches Poll, running from -3.1 up to -5.6, but not in the AP Poll – the average drop for a 4-to-8-point loss (-3.2) is larger than the average drop for a 9-to-16-point loss (-3.0).

(If you want to examine the table in this way yourself, hold down the Shift key as you click on the column headers. Try any two of the factors and then one of the averages – that'll show you the third factor.)

Upset  Month  Margin of Loss  AP games  AP avg. drop  CO games  CO avg. drop
noU    Bowl   -03 or less     14        -02.6         15        -03.1
noU    Bowl   -04 to -8       13        -03.2         12        -03.2
noU    Bowl   -09 to -16      13        -03.0         15        -03.5
noU    Bowl   -17 or more     22        -05.2         22        -05.6
noU    Nov    -03 or less     13        -03.1         13        -03.8
noU    Nov    -04 to -8       21        -03.0         20        -02.8
noU    Nov    -09 to -16      28        -04.9         25        -05.2
noU    Nov    -17 or more     50        -06.7         52        -06.7
noU    Oct    -03 or less     19        -03.2         18        -03.3
noU    Oct    -04 to -8       25        -04.0         26        -04.7
noU    Oct    -09 to -16      30        -05.9         30        -06.4
noU    Oct    -17 or more     43        -08.3         42        -08.4
noU    Sept   -03 or less     13        -03.2         14        -03.6
noU    Sept   -04 to -8       16        -03.7         17        -05.8
noU    Sept   -09 to -16      11        -08.2         13        -09.3
noU    Sept   -17 or more     41        -10.3         36        -09.9
U      Bowl   -03 or less     17        -04.6         18        -05.0
U      Bowl   -04 to -8       13        -05.8         14        -05.1
U      Bowl   -09 to -16      21        -05.4         20        -05.1
U      Bowl   -17 or more     17        -07.9         18        -07.3
U      Nov    -03 or less     56        -06.9         54        -06.6
U      Nov    -04 to -8       60        -06.7         64        -06.8
U      Nov    -09 to -16      44        -07.8         46        -07.6
U      Nov    -17 or more     57        -08.4         56        -07.8
U      Oct    -03 or less     63        -09.5         65        -08.6
U      Oct    -04 to -8       55        -09.2         58        -08.2
U      Oct    -09 to -16      34        -09.1         36        -08.8
U      Oct    -17 or more     48        -10.4         52        -09.8
U      Sept   -03 or less     46        -09.8         47        -09.3
U      Sept   -04 to -8       49        -11.4         47        -11.0
U      Sept   -09 to -16      31        -12.6         30        -12.1
U      Sept   -17 or more     45        -14.7         46        -13.5
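For what it's worth, each row of that table boils down to a group-and-average. Here's a rough Python sketch, assuming each loss record carries an upset flag, a month, a margin, and the spots dropped (all hypothetical field names):

```python
from collections import defaultdict

def mol_bucket(margin):
    """Map a margin of loss onto the table's four buckets."""
    if margin <= 3:
        return "-03 or less"
    if margin <= 8:
        return "-04 to -8"
    if margin <= 16:
        return "-09 to -16"
    return "-17 or more"

def drop_table(losses):
    """Group losses by (Upset, Month, MoL bucket); return count and avg drop."""
    groups = defaultdict(list)
    for g in losses:
        key = (g["upset"], g["month"], mol_bucket(g["margin"]))
        groups[key].append(g["spots_dropped"])
    return {key: (len(drops), sum(drops) / len(drops))
            for key, drops in groups.items()}
```

Fed the AP losses, something like drop_table(ap_losses)[("noU", "Bowl", "-03 or less")] would return (14, -2.6), matching the first line of the table.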

Expanding that analysis out over the whole table, I tallied 1 point for every time a combination of factors was in the "right" place, and ½ point for every time a combination was within one spot of the right place. For example, the first line should be -3 or less – if it is, 1 point; if it's -4 to -8, ½ point; if it's -09 to -16 or -17 or more, 0 points. Divide the total points by the number possible, 32, and you get a percentage.
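Here's that tally as a sketch, scoring one block of four margin-of-loss lines at a time. The blocks come ordered from the expected smallest drop to the expected largest, and I'm reading "within a spot" as one slot off in the sorted order – both assumptions, since the original spreadsheet isn't shown here:

```python
def combo_score(blocks):
    """Tally described above: each line earns 1 point if its average lands
    in the expected slot, 1/2 point if it's one slot off, 0 otherwise;
    the result is total points over total possible."""
    points = 0.0
    lines = 0
    for avgs in blocks:  # one block = the four MoL lines of an (Upset, Month) pair
        # Rank the averages; the smallest drop should sit in slot 0.
        order = sorted(range(len(avgs)), key=lambda i: abs(avgs[i]))
        for expected_slot, actual_line in enumerate(order):
            off_by = abs(actual_line - expected_slot)
            if off_by == 0:
                points += 1.0
            elif off_by == 1:
                points += 0.5
        lines += len(avgs)
    return points / lines

# The (noU, Bowl) block from the AP column above: -3.2 and -3.0 are swapped,
# so two lines are each one slot off -> (1 + 0.5 + 0.5 + 1) / 4 = 0.75.
print(combo_score([(-2.6, -3.2, -3.0, -5.2)]))
```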

Try the combination of Month, MoL, and either Avg Drop column. Since that combo gives you a look at upsets, how well does the "upsets drop you further than non-upsets" rule hold? Very well – 100%, actually, in both the AP and Coaches polls. So we can turn that percentage into the statement "when Month and MoL are equal, a non-upset will ALWAYS drop you less than an upset." Through a whole bunch of combinations, it's apparent that Upset is the most relevant factor and Month is a close second. Combining those two with each of the other relevant factors gives the following percentages:

Factor              AP      Coaches
Team Rank           90.0%   82.5%
Opp. Rank           68.8%   68.8%
Team in BCS Conf?   75.0%   75.0%
Opp. in BCS Conf?   100%    100%
Margin of Loss      81.3%   93.8%
Opp. Prev W         43.6%   41.8%
Opp. Prev 3W        53.9%   50.0%
Weeks Ranked        53.7%   58.3%

Since the Team Rank percentage is the highest in the AP Poll, and the Margin of Loss percentage is the highest in the Coaches Poll, those are the two factors I decided to use to compute the averages.

What about the 100% that the Opp. in BCS Conf? factor got in both polls? Yes, it's the most relevant of the remaining factors. But that brings us to another point:

Sample Size

Since all of these factors have some relevance, why not just combine them all into one big table and use that for the averages? Because the more factors you look at, the smaller the sample sizes become, and the smaller the sample size, the less reliable the average. Each of the 32 combinations of factors in the table above averages around 30 games. Add even the clearly relevant Opp. in BCS Conf? factor and you're at 64 combinations averaging around 15 games each – and some of those 64 would rest on a single game, while others wouldn't have any games at all. At the same time, we can't just chase bigger samples, because using fewer factors makes the averages less specific and so less accurate in their own way. Using only Upset, Month, and Opp. in BCS Conf?, for instance, gives just 16 combinations of around 65 games each. It's a tricky balancing act, finding a sample size that's neither too big nor too small. The combinations I chose – Upset+Month+Rank for the AP and Upset+Month+MoL for the Coaches – are, I think, about as good as we can get. If someone has numbers showing that other combinations of factors are more accurate or better correlated, I'd be happy to use those instead.
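The back-of-the-envelope version of that tradeoff is just division. A quick sketch, using the roughly 1,028 AP losses that made it into the averages (the factor level counts – 2 upset values, 4 months, 4 margin buckets, 2 conference answers – come straight from the tables above):

```python
# Rough arithmetic behind the sample-size tradeoff described above.
total_games = 1028  # AP losses used in the averages

for factors, combos in [
    ("Upset + Month + MoL",               2 * 4 * 4),      # 32 combinations
    ("Upset + Month + MoL + Opp BCS?",    2 * 4 * 4 * 2),  # 64 combinations
    ("Upset + Month + Opp BCS?",          2 * 4 * 2),      # 16 combinations
]:
    print(f"{factors}: {combos} combos, ~{total_games // combos} games each")
```

Of course, the real games don't spread themselves evenly across the combinations – that's exactly why some 64-combination cells would end up with one game or none.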

(Incidentally, another way to measure these factors and combinations is to check the standard deviation of each – the combination I used for the AP averages a standard deviation of 3.01, and the one I used for the Coaches averages 3.16. That's actually the measurement I used when determining the Leasts and Mosts categories – anything outside of the standard deviation got a label.)
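In code, that labeling could look something like the sketch below – reading "outside of the standard deviation" as more than one standard deviation from the mean drop within a combination, which is my assumption:

```python
import statistics

def least_most_labels(drops):
    """Return (drop, label) pairs for drops more than one standard
    deviation from the mean. Drops are negative, so falling below
    mean - sd means an unusually big fall ("Most"), and landing above
    mean + sd means an unusually small one ("Least")."""
    mean = statistics.mean(drops)
    sd = statistics.stdev(drops)
    return [(d, "Most" if d < mean - sd else "Least")
            for d in drops if abs(d - mean) > sd]
```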

I think that just about explains things, but if you have any questions or comments, drop me a line or post them. As usual, this isn't meant to be a definitive or exhaustive or concrete study - it's meant to help people look at a part of college football from a different perspective. And as with the non-conference schedule study I posted around this time last year, the point is to detail what the poll voters and coaches do - I'm leaving the why for you.
