Thursday, April 30, 2015

Stanley Cup Playoffs Round 2

I don't have much time right now, so I'll just leave the 2nd round predictions and probabilities here. Soon, I'll write a follow up regarding the model's performance in Round 1, as well as evaluating prediction accuracy and expectations.

But for now, here are the Round 2 predictions:

Tuesday, April 21, 2015

Closing a baseball game is not a discrete skill set

The entire Pirate bullpen is blowing it tonight.  Happens.  It's actually great when it all happens at once, because these guys are pretty awesome, and you'd rather their rare bad nights not be spread over multiple games.

Anyway, Steve Blass and Greg Brown on the TV side, just making this so much more difficult than it needs to be.

Blass is talking about how closing is the most difficult job in sports.  Brown is playing devil's advocate, not because he knows his stuff, but because he hates Blass and wants to show him up.  Anyway, the paraphrased discussion goes something like this:

BROWN: But you look at other relievers who often come in during much more difficult situations [1], whereas a closer just has to get three outs and not give up one or two runs.   I'm not disagreeing with you, Steve, I just want you to explain it for everyone at home.

BLASS:  Well... Uh... ask any manager[2].  It's just... those three outs are harder [3].  They just... everything about them matters.  It's do or die [4].

My paraphrased response goes like this:

1. We call those difficult situations "high leverage," Greg.  And just so everyone knows, leverage is measured by the volatility of the game outcome in any given situation.  If it's a ten run game, it doesn't much matter if either team has the bases loaded: the outcome is already pretty determined.  If it's a tied game, we consider it super high leverage because any one swing of the bat could make it a not-tied game.  Super easy to see how this works.  Super easy to realize that the asshole pitching in the 8th inning of a tied game is being entrusted with more responsibility than the asshole pitching the 9th inning of a 1-run game.

2.  Steve, I'm not going to ask any manager about anything for fear of this.  And also because managers are dumb and stupid.  They're the reason we have saves, and bunts, and blogs instead of early bedtimes. 

3.  Those last three outs are not harder.  They're not.  They're just not.  There is nothing about the rules, equipment, stadium, or players that changes significantly.  What changes is perception.  Radhames Liz comes in and gives up a moonshot.  Tony Watson comes in and gets blasted.  Mark Melancon comes in and is crap.  It's all the same, but we only fault one of them because our memories don't go back more than one inning.  (Actually, mine does, and not to brag but I remember some assbag pitcher for the Cubs giving up a bases loaded double to Jung-Ho Kang.  By the looks of it, the three outs in the bottom of the 7th were some of the toughest to record.  Perception.)

4.  It's do-or-die when the game is tied and you are really close to the end of the game.  It is specifically not do-or-die when you have the lead.  When you have the lead, it is do-or-else-you-might-have-to-score-more-runs-but-at-least-you-won't-be-dead.  This is absolutely true when you're at home, usually true when you're on the road.

Pitching with the lead is a cushion that is unnecessarily bestowed upon a team's best reliever.  On those rare bad nights, everything is all the more glaring.  But look what stats teach us: over a long enough timeline (like 162 games) we see a lot of shit and some of it doesn't mean anything, even if it is glaring.  The specific inning in which a reliever is crappy: meaningless.

Sunday, April 19, 2015

Sunday baseball wrap-up

Bucs are at .500!  This is great news, because they've been playing quality ball thus far, and don't deserve to be in the red.  Hooray!

Sunday Night Baseball just wrapped up its coverage of the Cardinals and Reds at 10:10pm.  The game started at 8:05pm.  This is incredibly good stuff.  MLB, you are doing something so right, and you deserve a pat on the back. 

About doing something wrong, Cincinnati manager Bryan Price is again a big dummy.  Mike "Sticky Fingers" Leake had pitched 7 innings of 1 run ball when Jon Jay leads off the bottom of the 8th with a double.  Runner on second, no outs, tie game, bottom of the 8th... pretty clear that you really need a strikeout if you're the Reds.  And if you're the Reds, good news: you have this swing-and-miss monster named Aroldis Chapman in your bullpen!  But Bryan Price is like "No.  This one's for Dusty." and he leaves Leake in to pitch to Yadier Molina.

Leake isn't a bad pitcher, but he's not what we call dominant.  He pitches to contact and hopes the ball stays on the ground.  Yadier Molina isn't a dominant hitter, but he is good at making solid contact.  Maybe he comes through with a seeing-eye single here; I'm sitting at home feeling either the deep fly ball to advance the runner or the fielder's choice.   Yadi gives me the fielder's choice, 5-3.

Now there's a runner on third, one out, tie game, bottom of the 8th.  You absolutely need a strikeout in this situation.  Bringing the infield in does nothing to stop a fly ball, and if anything, it decreases the amount of ground your fielders can cover.  Maaaybe you could make a case for an intentional walk followed by wishfully-thinking your way to a double play ball.  But really you just need to get an out without the ball being put in play.  Left-handed Kolten Wong deserves to be overpowered in this situation.  He actually yearns for it. 

"Don't do it!" whispers the ghost of Dusty Baker.  "You've gotta respect what Leake has done to get you where you are."

No!  Bad advice!  Don't listen to him, Bryan!  He's trying to get you fired, too!

"A starter finishes what he starts.  You've gotta save your closer for when you have the lead."

Shut up, Dusty, shut up!

"If you do anything here, it should be to warm up two relievers who aren't Aroldis Chapman, just because."

Oh my god, Dusty, you're the reason he did that!!


Right.  So, as anyone who watched knows, it's too late for blogging now.  At least if you're the Reds. Leake gave up the very predictable flyball.  Run scores.  Cardinals record 3 outs, complete the sweep. 

We know SO much about the effectiveness of pitchers.  We know that they get worse the longer they stay in the game.  We know that certain match-ups are superior to others.  We know that Aroldis Chapman is better at throwing a baseball than all but like 30 people in the entire world (none of whom are Mike Leake).  We also know that sometimes a decent but not consistently great pitcher like Mike Leake can go toe-to-toe with a stud like Adam Wainwright for 7 innings or so.  This is what we call: playing with house money.  It is small sample size success, and it has no bearing on a guy's ability to deliver beyond his normal means.

I fully expect Part II of this article to be up tomorrow.  As well as a gif of Kolten Wong saying "But I yearned to swing through three 103mph fastballs!"

Wednesday, April 15, 2015

2015 Stanley Cup playoff predictions

It's time for my second annual NHL playoff predictions! For a background on last season's model, check out the post from last year. If you're new to some of the more newfangled hockey stats, check out this great introductory piece. Now, let's dive in!

Data Sources (check out these sites)
Man Games Lost

The Model

This year's model was similar to last year's model despite my ability to test many more variables. Within the past year, we've seen some great hockey data sites (notably war-on-ice) come online, and we have more data than ever before. Last year's model consisted of score-adjusted Fenwick percentage, 5v5 save percentage, and penalty killing percentage. I tested more variables this year, including: score-adjusted Fenwick percentage over last 20 regular season games, score-adjusted Fenwick for per 60 mins, score-adjusted Fenwick against per 60 mins, late season (March and April) score-adjusted Fenwick for per 60 mins, late season score-adjusted Fenwick against per 60 mins, 5v5 save percentage of anticipated series goalie, 5v5 adjusted save percentage of anticipated goalie, 5v5 high danger save percentage of anticipated goalie, shorthanded Fenwick against per 60 mins, adjusted short-handed save percentage of anticipated goalie, power play Fenwick for per 60 mins, and Time Missed Impact to Team.

I ran a logistic regression model and used ten-fold cross validation to test its predictive ability. Model details may be coming in another post. The final model consisted of the following 4 predictors: score-adjusted Fenwick percentage in last 20 games, team 5v5 save percentage, penalty killing percentage, and power play percentage. Like last year, power play percentage was not statistically significant, but it did slightly help the predictive ability of the model, so I left it in. I used data from the 2007-2008 season up through the 2013-2014 season to construct the model. The sample size is 105 playoff series over that time period.
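
For readers who want a feel for what "logistic regression with ten-fold cross validation" means in practice, here is a minimal sketch in plain Python. It is not the model from this post: the data is synthetic, the four features are stand-ins for stat differences between two teams, and the names `fit_logistic`, `cross_validate`, and `TRUE_W` are made up for illustration. The only number borrowed from the post is the sample size of 105 series.

```python
import math
import random

def sigmoid(z):
    # Clamp z so math.exp cannot overflow on extreme inputs.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, x):
    """Probability of a series win given weights w (bias first) and features x."""
    return sigmoid(w[0] + sum(wj * v for wj, v in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights by batch gradient descent on the average log loss."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

def cross_validate(X, y, k=10, seed=0):
    """Mean held-out accuracy over k folds: train on k-1 folds, test on the other."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accuracies = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        correct = sum((predict(w, X[i]) >= 0.5) == (y[i] == 1) for i in fold)
        accuracies.append(correct / len(fold))
    return sum(accuracies) / k

# Synthetic "series": 4 made-up feature differences, outcome drawn from TRUE_W.
# These are NOT real NHL numbers or the post's fitted coefficients.
rng = random.Random(42)
TRUE_W = [0.0, 1.5, 0.8, 0.5, 0.2]
X, y = [], []
for _ in range(105):
    x = [rng.gauss(0, 1) for _ in range(4)]
    X.append(x)
    y.append(1 if rng.random() < predict(TRUE_W, x) else 0)

cv_accuracy = cross_validate(X, y)
print("mean held-out accuracy:", round(cv_accuracy, 3))
```

The point of the held-out folds is the same as in the post: a variable earns its spot by helping predict series the model has not seen, not just by fitting the ones it has.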

The Simulations
The logistic model can calculate the probability of a team winning a playoff series. It can take any potential match-up, input the teams' peripheral statistics, and calculate a probability of outcome. Using those probabilities, we can simulate the playoffs a whole bunch of times (10,000 in our case) and see how often each team wins.
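
The simulation step described above can be sketched roughly like this. It assumes a toy four-team bracket and invented strength scores in `RATINGS`; the team names, the scores, and the helper names (`series_win_prob`, `simulate_bracket`, `title_odds`) are all hypothetical, and the real model would plug in its own fitted series probabilities instead.

```python
import math
import random
from collections import Counter

# Hypothetical strength scores on the logistic scale. Not real team numbers.
RATINGS = {"A": 0.6, "B": 0.2, "C": -0.1, "D": -0.5}

def series_win_prob(team, opponent):
    """Logistic probability that `team` wins a series against `opponent`."""
    return 1.0 / (1.0 + math.exp(-(RATINGS[team] - RATINGS[opponent])))

def play_series(team, opponent, rng):
    """Draw one series outcome from the modeled probability."""
    return team if rng.random() < series_win_prob(team, opponent) else opponent

def simulate_bracket(bracket, rng):
    """Pair off adjacent teams round by round until one champion remains.

    The bracket length is assumed to be a power of two.
    """
    teams = list(bracket)
    while len(teams) > 1:
        teams = [play_series(teams[i], teams[i + 1], rng)
                 for i in range(0, len(teams), 2)]
    return teams[0]

def title_odds(bracket, n_sims=10_000, seed=1):
    """Share of simulated playoffs won by each team."""
    rng = random.Random(seed)
    wins = Counter(simulate_bracket(bracket, rng) for _ in range(n_sims))
    return {team: wins[team] / n_sims for team in bracket}

# First round: A vs D and B vs C; the winners meet in the final.
odds = title_odds(["A", "D", "B", "C"])
for team, p in sorted(odds.items(), key=lambda kv: -kv[1]):
    print(team, round(p, 3))
```

Because each simulated playoff picks its later-round opponents from earlier results, a team's title odds automatically account for how hard its path is, not just how good the team is.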

The Caveats  

We don't know everything. We can't accurately capture everything that happens on the ice and turn it into sure-fire predictions. There's a lot of inherent randomness in hockey, and a lot of data that we're not yet able to collect. But what we can say is that teams that are good at puck possession heading into the playoffs, good on the penalty kill, and good at 5v5 save percentage are more likely to win than teams that aren't as good at those things. 
Of course, a goalie can go on an incredible run and carry a team. That's how the Bruins won the Cup in 2011. Any team can beat any other team in a small sample size series. I have also not adjusted for injuries. We know that Kris Letang is out for the Pens, and that Christian Ehrhoff and Derrick Pouliot are also banged up, and this really hampers their defense corps. Max Pacioretty may not play for Montreal. The predictions should be able to give us an idea of which teams may be overrated or overlooked, and it should give us an idea of which teams are more likely than others to go deep into the playoffs. The inherent randomness in hockey makes a lot of individual series too close to call, and it also makes for a lot of drama and fun. 

The Predictions

The model loves the Pens. Among playoff teams, the Pens were the best at score-adjusted Fenwick percentage in their last 20 games. They were third-best on the penalty kill. The Rangers have the edge in terms of save percentage, but they have struggled with puck possession. It seems crazy to think that the Pens, who struggled mightily down the stretch, are favorites against the team that won the Presidents' Trophy. But here we are.

The model also loves the Caps. They are slightly better than the Islanders at puck possession, but their odds are so good because of goaltending and special teams. The Islanders are terrible on the penalty kill, and the Caps are great on the power play. Braden Holtby has had a very solid season for the Caps. Meanwhile, the Islanders have the worst 5v5 save percentage of any playoff team.

Anaheim is another team that looks vulnerable. The Jets have been great down the stretch. Both have been good puck possession teams in their last 20 games, but Winnipeg has a slight edge. The Jets also have a slight edge in goaltending and special teams.

Now let's take a look at the conference and Cup predictions from the simulations:

I know what you're thinking. It's similar to what I'm thinking. It's kind of shocking that the Pens are at the top of this list, but their underlying stats have been very good. They've been snake-bitten by low shooting percentage, bad luck, injuries to key players, and salary cap mismanagement that forced them to play several games with only five defensemen. A couple of teams with very good records, the Rangers and Canadiens, are near the bottom of this list. They're underwater possession teams, and the model does not think these teams are likely to win three or four playoff series. Of course, with Henrik Lundqvist and Carey Price, anything's possible. 

The Ducks have been a decent team of late. The model is down on their chances because of their goaltending and their path to the finals. They'd have to beat a good Jets team and likely one of Chicago or St. Louis to get to the finals.

It's probably most constructive to think of these results in terms of a group of teams that rise to the top. Looking at the probabilities, there's a 77 percent chance that the Stanley Cup winner comes from this group: Pittsburgh, St. Louis, Washington, Winnipeg, and Chicago. 

Looking back to my model from last year, my initial predictions showed an 85 percent chance of the Cup winner coming from this group: Boston, Los Angeles, NY Rangers, St. Louis, and San Jose. 

I'll keep updating the predictions as the playoffs progress. Enjoy the first round!