Tuesday, August 31, 2010

Can we predict hockey standings?

The age-old question. Each September, hockey columnists give us their predicted standings. We read along, caught up in the impending arrival of a new season. We don't even mind when Scott Burnside calls a team "plucky". Then, if you're like me, you realize that these predictions aren't based on any quantitative measures, and you're left a little unsatisfied. So here's the first step to finding satisfaction.

There are a few adaptations of the Bill James Pythagorean Expectation out there, but I decided to look at things a little differently. Let's look at the relationship between points and goal differential for the 2009-2010 season, then for all seasons post-lockout.

Hey, that's really linear! Using all post-lockout data, we can use a linear mixed-effects model to fit the data and come up with our predicted point values. Here's how the model predicts the 2009-2010 Eastern Conference standings.


Predicted            Actual
Washington    121    Washington    121
New Jersey    103    New Jersey    103
Buffalo       102    Buffalo       100
Pittsburgh     99    Pittsburgh    101
Philadelphia   96    Ottawa         94
Boston         94    Boston         91
NY Rangers     93    Philadelphia   88
Montreal       90    Montreal       88

Ottawa         88    NY Rangers     87
Atlanta        85    Atlanta        83
Carolina       83    Carolina       80
Florida        80    Tampa Bay      80
NY Islanders   78    NY Islanders   79
Tampa Bay      77    Florida        77
Toronto        74    Toronto        74

And now the Western Conference:


Predicted            Actual
Chicago       113    San Jose      113
San Jose      109    Chicago       112
Vancouver     109    Phoenix       107
Phoenix       100    Vancouver     103
Los Angeles   100    Detroit       102
Detroit        96    Los Angeles   101
Colorado       96    Nashville     100
St. Louis      93    Colorado       95

Nashville      92    St. Louis      90
Calgary        90    Calgary        90
Anaheim        88    Anaheim        89
Dallas         86    Dallas         88
Minnesota      83    Minnesota      84
Columbus       77    Columbus       79
Edmonton       68    Edmonton       62
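
To put a rough number on how close the two sets of point totals are, here is a minimal Python sketch that computes the mean absolute error per conference. The numbers are copied from the two lists above, reading each team's first figure as the model's value and the second as the actual one (an assumption on my part, though the absolute error comes out the same either way). The names east, west and mean_abs_error are made up for this illustration.

```python
# (model points, actual points) for each team, copied from the lists above.
# Which figure is "model" and which is "actual" is an assumption; the
# absolute error is symmetric, so the result does not depend on it.
east = {
    "Washington": (121, 121), "New Jersey": (103, 103), "Buffalo": (102, 100),
    "Pittsburgh": (99, 101), "Philadelphia": (96, 88), "Boston": (94, 91),
    "NY Rangers": (93, 87), "Montreal": (90, 88), "Ottawa": (88, 94),
    "Atlanta": (85, 83), "Carolina": (83, 80), "Florida": (80, 77),
    "NY Islanders": (78, 79), "Tampa Bay": (77, 80), "Toronto": (74, 74),
}
west = {
    "Chicago": (113, 112), "San Jose": (109, 113), "Vancouver": (109, 103),
    "Phoenix": (100, 107), "Los Angeles": (100, 101), "Detroit": (96, 102),
    "Colorado": (96, 95), "St. Louis": (93, 90), "Nashville": (92, 100),
    "Calgary": (90, 90), "Anaheim": (88, 89), "Dallas": (86, 88),
    "Minnesota": (83, 84), "Columbus": (77, 79), "Edmonton": (68, 62),
}


def mean_abs_error(table):
    """Average absolute gap, in points, between the two figures per team."""
    return sum(abs(model - actual) for model, actual in table.values()) / len(table)


east_mae = mean_abs_error(east)
west_mae = mean_abs_error(west)
league_mae = mean_abs_error({**east, **west})

print(f"East: {east_mae:.2f}  West: {west_mae:.2f}  League: {league_mae:.2f}")
```

By this measure the model misses by roughly 2.7 points per team in the East, 3.3 in the West, and 3.0 across the league.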

The model does pretty well. Can we use this to predict the 2010-2011 standings? Yes, but we'd have to estimate each team's goal differential. Unlike in baseball, we don't really have individual player-level data (something like RAR), so it's a bit difficult to project how many goals a team will score and allow. I guess that's step 2.
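
For anyone who wants to play along, here is a minimal Python sketch of the basic idea: fit a straight line to points versus goal differential, then plug in a projected goal differential. Two loud caveats. First, this is a plain least-squares line, not the linear mixed-effects model used for the numbers above. Second, the five data points are invented placeholders chosen to sit exactly on a line, not real team seasons. The names fit_line and predict_points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder data, NOT real seasons: goal differential and points.
goal_diff = [-30, -10, 0, 20, 40]
points = [80, 88, 92, 100, 108]

slope, intercept = fit_line(goal_diff, points)


def predict_points(projected_goal_diff):
    """Predicted points for a team with the given projected goal differential."""
    return intercept + slope * projected_goal_diff


print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"Projected +25 goal differential -> {predict_points(25):.1f} points")
```

With real post-lockout data in place of the placeholders, the hard part is still the input: predict_points is only as good as the projected goal differential you hand it.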


FRANCOfranco said...

What an amazing tease of a graph!

Unknown said...

As a founding author of this blog, I find the use of actual data and analytical skills here to be deeply and existentially threatening. You cannot graph grit, or guts, or gsoul. I have tried. Clearly, this new author is a harbinger of computers running hockey.