Sunday, March 2, 2014

Bayes Theorem and the College Football Title Contenders (Paul Dalen)

Before I dig into the results of the analysis, here's a quick summary of how I derived my starting data and the assumptions I had to make.
First of all, Bayes Theorem, as I'm applying it here, requires that the outcomes being predicted be mutually exclusive and that the data points used to estimate the probability be independent.  In this case mutual exclusivity was not an issue: I chose "12 or more wins" and "fewer than 12 wins" as the two outcomes.  Since a team can't be both, they are mutually exclusive.  Independence is harder to establish.
The Pythagorean Theorem of Football is based on a team's average points for and average points against.  It gives a number between 0 and 1 which, when multiplied by the total number of games a team played in a season, gives an expected win total for the season.  It is a direct calculation from on-field play.  The blue-chip percentage has an impact on expected wins: it's been well established that having blue chips on a team is important if a team aspires to play at an elite level.  Because of this, I cannot say with confidence that the two are independent.  That doesn't invalidate the utility of Bayes Theorem, but the results become less useful the more directly the data points are related.  I believe that the relationship between a Pythagorean Expected Win Total and the percentage of blue-chip recruits on a team is sufficiently distant for this approach to be useful.
I chose the Pythagorean Theorem of Football because it is a good indicator of whether a team's win-loss record is congruent with its performance on the field, and therefore of whether a team will do better or worse the next year.  The blue-chip percentage is a good proxy for the amount of pure talent on the team.  In building a prediction model this way I'm making the assumption that a team's schedule will be similar from year to year.
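For readers who want to reproduce the Pythagorean numbers, here is a minimal sketch of the calculation. The exponent isn't specified above, so the 2.37 used here (a value often quoted for football) is an assumption, and the function names are made up for illustration. The difference is taken as actual wins minus expected wins, which matches the signs used in this post (negative means a team won fewer games than its scoring suggests).

```python
def pythagorean_win_pct(points_for: float, points_against: float,
                        exponent: float = 2.37) -> float:
    """Return a number between 0 and 1: the share of games a team
    'should' win given its points for and points against.

    The exponent is an assumption; 2.37 is a commonly quoted value.
    """
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)


def expected_wins(points_for: float, points_against: float, games: int,
                  exponent: float = 2.37) -> float:
    """Pythagorean win percentage multiplied by games played."""
    return pythagorean_win_pct(points_for, points_against, exponent) * games


def win_difference(actual_wins: float, points_for: float,
                   points_against: float, games: int,
                   exponent: float = 2.37) -> float:
    """Actual wins minus expected wins (negative = underperformed)."""
    return actual_wins - expected_wins(points_for, points_against,
                                       games, exponent)
```

As a sanity check, a team that scores and allows the same number of points comes out at exactly 0.5, or 6 expected wins over a 12-game schedule.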
The analysis is built on calculating the probability that a team will reach 12 regular season wins.  In order to calculate this, I estimated the probability that a team with a given difference between its actual wins and expected wins would improve its win total by a given amount.  For instance, in 2013 Bowling Green had an expected win difference of -1.9.  This is strong evidence that its actual win total was not truly indicative of its potential.  There is strong historical evidence that teams that fall 2 games or more below their expected win totals will experience a bounce back the next year.  As an example of these probabilities, the probability that a team with a Pythagorean Win difference between -1.75 and -2 will improve by no more than 3 wins is about 30%.  Florida State won all its games, so it had an expected win difference of .54.  The probability that its win total would change by no more than -1 is .52.
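The bucketed probabilities described above can be estimated by simple counting. The sketch below is one way to do it, not necessarily how the tables in this post were built: the function name, the half-open bin edges, and the tiny `toy_history` list are all made up for illustration and are not real team data.

```python
def prob_improvement_at_most(history, diff_low, diff_high, max_improvement):
    """Estimate P(next-year win change <= max_improvement) for teams whose
    prior Pythagorean difference falls in [diff_low, diff_high).

    history: iterable of (prior_diff, next_year_win_change) pairs.
    Returns None if no team falls in the bin.
    """
    in_bin = [change for diff, change in history
              if diff_low <= diff < diff_high]
    if not in_bin:
        return None
    hits = sum(1 for change in in_bin if change <= max_improvement)
    return hits / len(in_bin)


# Made-up toy records, NOT real data: (prior difference, change in wins).
toy_history = [(-1.9, 4), (-1.8, 2), (-1.85, 3), (-1.95, 5), (0.5, -1)]

# Share of toy teams in the -2 to -1.75 bin that improved by 3 wins or fewer.
toy_estimate = prob_improvement_at_most(toy_history, -2.0, -1.75, 3)
```

With real historical seasons in place of `toy_history`, the same count gives one cell of the probability table.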
I've attached the table of probabilities at the end of the article.
Next, I calculated the four-year average blue-chip percentage for each team from 2008 to 2013 and used that to establish probabilities that a team with a given blue-chip percentage would reach 12 wins.  The table of probabilities for this is at the end of the article as well.
With those two probabilities I calculated the probability that a team would reach 12 wins, GIVEN that it had a certain percentage of blue-chip recruits on the team.
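For reference, the general two-outcome form of Bayes Theorem can be sketched as follows. This is the textbook formula, not a reconstruction of the exact spreadsheet behind the rankings; which table feeds which argument (for example, using the Pythagorean-based probability as the prior and the blue-chip table for the likelihoods) is an assumption, and the function name is hypothetical.

```python
def bayes_posterior(prior: float, likelihood_if_true: float,
                    likelihood_if_false: float) -> float:
    """P(H | E) for a yes/no hypothesis H, e.g. '12 or more wins'.

    prior:               P(H) before seeing the evidence E
    likelihood_if_true:  P(E | H)
    likelihood_if_false: P(E | not H)
    """
    numerator = likelihood_if_true * prior
    # Total probability of the evidence over both mutually exclusive outcomes.
    evidence = numerator + likelihood_if_false * (1.0 - prior)
    if evidence == 0:
        raise ValueError("evidence has zero probability under both outcomes")
    return numerator / evidence
```

For instance, `bayes_posterior(0.5, 0.8, 0.2)` comes out at about 0.8, and evidence that is equally likely under both outcomes leaves the prior unchanged.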
The top 10 by this method are below.
Rank  School          Prior Pythag Diff  4-Year BC %  P(12 wins)
1     Florida State    0.541             0.515        0.804
2     Louisville       0.123             0.126        0.562
3     Ohio State       0.452             0.714        0.539
4     Michigan State   1.032             0.188        0.347
5     Bowling Green   -1.917             0.000        0.334
6     Alabama         -0.747             0.711        0.324
7     Stanford         0.298             0.383        0.245
8     Marshall        -1.120             0.018        0.245
9     Baylor          -0.117             0.143        0.234
10    Oregon          -0.112             0.493        0.234
