At this point, I’ve tried everything.
For the past 15 or so years, I’ve been desperately trying to create the perfect statistical bracket. ESPN lets you fill out 10 brackets per account. I have three accounts and have been maxing those babies out year after year.
It was pretty basic stuff in my formative years, but it didn’t take long to realize that creating a bracket based solely on points per game was a terrible idea—and would have Northwestern State beating Iona in the National Championship this year.
I eventually upgraded to slightly more advanced stats which don’t simply appear on the team’s home page for you to effortlessly place in an Excel spreadsheet—things like rebounding margin, assist/turnover ratio, etc. But still nothing really panned out.
Things got out of hand during my college years.
Maybe you viewed spring break as a time to unwind and blow off some steam down in Florida, but I always saw it as the most important time of the year to get real work done.
For a bracket I called “Hometown Heroes,” I spent roughly 12 hours figuring out the hometown of the starting five for each team and then calculating the distance each player’s family would need to travel to get to the site where their baby boy was playing.
The worst part of that bracket was that I needed to recalculate the distances every two rounds as the sites changed. Actually, the worst part of the bracket was the end result, as I’m pretty sure it was the least accurate one I filled out that season. Any team with an international player in its starting lineup was completely screwed.
The year prior to that one, I made a formula that incorporated every “basic” team statistic under the sun into one mega formula—points, assists, rebounds, steals, turnovers, blocks, free-throw percentage, three-point percentage and probably others. The rationale there was that if using one stat is wildly unpredictable, perhaps using a dozen stats tips the scales towards the best teams.
As you can imagine, just getting that data into the spreadsheet was hard enough, but no matter how much I tinkered with the formula, it was never good enough. Sure, I could weight it in such a way as to get the national champion from one season, but it never held true for other seasons.
What do you base your bracket on?
I did a weighted coin-flip bracket two years ago, which actually ended up being fairly accurate by completely dumb luck. I’ll do one again this year just for kicks and giggles.
With that one, each team’s seed number represented the number of times its side of the coin had to come up in order to win the game. For example, when a No. 4 seed plays a No. 13, the No. 4 seed just needs four heads to come up before 13 tails do. Surprisingly, there were a good number of upsets in that bracket, though, because probabilities are a crazy thing.
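For anyone who would rather not flip a real coin a few hundred times, here is a rough Python sketch of that rule. The function names are made up for illustration, and it assumes a fair coin where heads always belongs to the first team listed. It plays out a single game and also works out the exact odds, using the fact that the race is always settled within (seed A + seed B - 1) flips.

```python
import random
from math import comb


def team_a_wins(seed_a, seed_b, rng=random):
    """Flip a fair coin until team A has seed_a heads or team B has seed_b tails.

    Returns True if team A (heads) gets there first.
    """
    heads = tails = 0
    while heads < seed_a and tails < seed_b:
        if rng.random() < 0.5:
            heads += 1
        else:
            tails += 1
    return heads == seed_a


def win_probability(seed_a, seed_b):
    """Exact chance that team A collects seed_a heads before seed_b tails.

    Team A wins exactly when at least seed_a of the first
    (seed_a + seed_b - 1) flips are heads.
    """
    total_flips = seed_a + seed_b - 1
    favorable = sum(comb(total_flips, k) for k in range(seed_a, total_flips + 1))
    return favorable / 2 ** total_flips


if __name__ == "__main__":
    print(round(win_probability(4, 13), 3))  # No. 4 seed vs. No. 13 seed
    print(round(win_probability(8, 9), 3))   # No. 8 seed vs. No. 9 seed
```

Under those assumptions, the No. 4 seed beats the No. 13 about 99 percent of the time, while the No. 8 seed beats the No. 9 only about 60 percent of the time, so most of the chaos should come from the middle of the bracket.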
Aside from a select few that never translated to success in successive seasons, every method has been mostly garbage. Maybe one of these years I’ll break down and join the hipster crowd basing their entire lives on KenPom’s tempo-free statistics.
I often wonder how demoralized Ken Pomeroy feels when his months of tireless work are defeated in a bracket pool by some kid who picks Michigan State to beat Miami because he just watched 300 for the 300th time and has come to the conclusion that yes, an army of Spartans could totally defeat a hurricane.
I would’ve given up on my quest years ago if not for the fact that I created a formula as a teenager that I would love to be able to beat just once as a grown man.
At this point, you’re probably saying, “You already have the formula? And you made us read 600 words about your boring life of filling out brackets?” Sorry about that. I had to make you earn it.
The formula is: (average margin of victory) * (conference RPI).
That’s it. All you need are two data points.
It makes sense, though, doesn’t it? Using just average margin of victory swings the early-round odds to teams that just rampaged through a weak conference. I can remember one year Niagara was a No. 14 seed, but the average margin of victory had them as a Final Four team after they spent half the season destroying the lowly MAAC.
On the other hand, just using conference RPI ends up giving you a Final Four of teams from the same conference. And who gets the edge when teams from the same conference play each other?
This formula gives you a happy medium in which teams don’t get overvalued for beating up on weak opponents, nor do they get overvalued for struggling against strong opponents. The national champion should be the team that dominated most regularly against the best teams.
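If you want to try this without a spreadsheet, a minimal Python sketch of the formula might look like the following. Everything here beyond the multiplication itself is an assumption: the names are invented, conference RPI is treated as a decimal rating where higher means stronger (not a ranking), ties go to the first team listed, and the two teams and their numbers are placeholders, not real season data.

```python
from typing import NamedTuple


class TeamStats(NamedTuple):
    name: str
    avg_margin: float      # average margin of victory, in points per game
    conference_rpi: float  # assumed to be a rating (higher = stronger), not a rank


def bracket_score(team: TeamStats) -> float:
    """The formula from the article: margin of victory times conference RPI."""
    return team.avg_margin * team.conference_rpi


def pick_winner(team_a: TeamStats, team_b: TeamStats) -> TeamStats:
    """Advance whichever team has the higher score (ties go to team_a)."""
    return team_a if bracket_score(team_a) >= bracket_score(team_b) else team_b


# Placeholder numbers for illustration only, not real season data.
mid_major = TeamStats("Mid-Major U", avg_margin=12.0, conference_rpi=0.45)
power_team = TeamStats("Power Conference U", avg_margin=10.0, conference_rpi=0.58)

if __name__ == "__main__":
    print(pick_winner(mid_major, power_team).name)
```

With these made-up numbers, the mid-major wins by more each night but plays in a weaker league, so its score comes out slightly lower and the power conference team advances. When two teams share a conference, the RPI term is identical and margin of victory alone decides the game.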
Which teams does that statistic favor this year? It has Indiana beating Gonzaga in the championship with Florida and Louisville rounding out the Final Four.
Sounds pretty feasible to me.
It’s far from perfect—and oddly enough has every No. 11 seed beating the No. 6 seed in the first round this year—but it’s correctly predicted the Final Four at least four or five times in the past 15 tournaments.
When it wins you a bunch of jelly beans in your bracket pool, be sure to thank me.