Friday, 28 June 2013

Slicing and Dicing: Re-Examining Data on Draft Eligible D-Men

I love getting feedback and hearing from other people when I write stuff, even if it's simply to tell me that I'm an idiot for reasons X, Y and Z. It gives me ideas on how to examine my arguments in different ways and see if my initial conclusions were either flawed or could be refined. A couple of people made some good queries this morning that I feel are worth looking at, so that's what I'll take this time to do.

First off, @garik16 raises a valid point. Assists may be influenced by teammates in both positive and negative ways, so perhaps looking at goals per game played will yield a clearer picture of offensive ability than points per game does, and subsequently forecast NHL success more accurately:
As it turns out, the scatterplots for G/GP vs. NHL% of GP and Pts/GP vs. NHL% of GP behave in very similar ways. Both have a large cluster of non-scoring early-round busts in the bottom left-hand corner, while the rest of the plot is pretty random. In both cases, there isn't enough of a linear relationship to conclude that offensive output in the CHL correlates directly with NHL success. However, as I said before, there is enough evidence to conclude that guys who don't score in the CHL don't become NHL players. Here's the plot:

It looks like there could be a linear relationship here if the picture weren't clouded by so much noise at the bottom and so many outliers to the right. I'm not about to eliminate half of my data though, so we're left with this mess.
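One way to put a number on "not enough of a linear relationship" is a Pearson correlation coefficient between draft-year G/GP and NHL % of GP. The sketch below is only an illustration: the function name `pearson_r` and the two short lists are made up for the example and are not the sample behind the plot.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up values purely to exercise the function; NOT the real data.
g_per_gp = [0.02, 0.05, 0.08, 0.10, 0.15, 0.20, 0.28, 0.33]
nhl_pct_gp = [0.00, 0.05, 0.60, 0.10, 0.85, 0.40, 0.05, 0.10]

r = pearson_r(g_per_gp, nhl_pct_gp)
print(f"r = {r:.2f}, r^2 = {r * r:.2f}")
```

A value of r near zero would back up the "pretty random" read of the scatterplot, though a single coefficient says nothing about the cluster of busts in the corner, which is the part I actually care about.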

What I find interesting is that once you get out to around 0.25 CHL draft-year G/GP, most of the guys end up flaming out before making the NHL. In fact, here are the top 10 CHL defensemen in this sample, sorted by G/GP:

Personally, I think this list is hilarious. The only "regular" NHLers here are St. Louis Blues spare part Kris Russell, journeyman Steve Eminger, and one of the all-time busts in Cam Barker. The rest of these guys can't even crack an NHL roster full-time. So how did they perform so well in their draft years? The most probable explanation is something that we see in the NHL all the time: individual shooting percentage is volatile in small sample sizes, leading to inflated goal totals over a single season.

It's in circumstances like this that possession numbers, deployment patterns and shooting percentages would be very informative. After all, we're looking at a very small sample in "defenseman goals in a single short season." Possession metrics effectively expand the sample size, allowing us to paint a more meaningful picture of what's happening on the ice over a shorter time period. Hell, even looking at just individual shots on goal should expand the amount of data available roughly twentyfold (assuming a CHL d-man shoots at something like 5% on average). Unfortunately, this data wasn't available to me when I was data mining, so it'll have to wait for another day, if it even exists at all.
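To put rough numbers on the sample-size point, here's a small sketch. It assumes the 5% shooting figure from above and treats every shot as an independent coin flip (a simple binomial model, which is my simplification); the 100-shot season is a hypothetical workload, not a figure from the data.

```python
import math

SHOOTING_PCT = 0.05  # the rough 5% figure assumed in the text


def shots_per_goal(shooting_pct):
    """How many shots on goal, on average, stand behind each goal."""
    return 1.0 / shooting_pct


def goal_total_spread(shots, shooting_pct):
    """Expected goals and their standard deviation under a binomial model."""
    expected = shots * shooting_pct
    std_dev = math.sqrt(shots * shooting_pct * (1.0 - shooting_pct))
    return expected, std_dev


# Hypothetical workload: 100 shots on goal in a draft-year season.
expected, std_dev = goal_total_spread(100, SHOOTING_PCT)
print(f"shots per goal: {shots_per_goal(SHOOTING_PCT):.0f}")
print(f"expected goals: {expected:.1f} +/- {std_dev:.1f}")
```

Under those assumptions a d-man expected to score 5 goals has a standard deviation of a bit over 2, so swings of a few goals either way are ordinary luck. That's the kind of volatility that can push a guy onto a top-10 G/GP list for one season.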

So to finally answer what @garik16 was wondering, looking at G/GP doesn't tell us much, at least not about the data I have. This is probably because goals provide such a small sample of data that they're just too volatile to conclude anything more than what we already knew merely by looking at Pts/GP. While assists may introduce "quality of teammates" noise, eliminating them introduces a lot of plain random noise.


The other point that was mentioned by a couple of people was that guys drafted in the first round tend to be offensive guys, with the implication that guys drafted in the 2nd and 3rd rounds are regarded as riskier anyways:

First off, I'd hesitate to call it a "0.6 Pts/GP rule" since I think putting a set-in-stone number on it tends to make it look like I believe what I'm presenting is gospel. It isn't. I'm merely trying to present evidence to make the point that defensive ability is probably overvalued in CHL defensemen, and that GMs probably shouldn't spend early round picks on guys who exhibit no offensive ability.

Anyways, to test whether my theory holds up when looking at the data round by round, I divided the total sample of players into three batches, based on Derek Zona's "Impact Player Percentage." This is essentially a percentage measure of how many players from a particular grouping in the draft turn out to be impact NHL players. Based on Zona's work, a top-25 pick turns into an "impact player" roughly 39% of the time, a player drafted anywhere from 26th to 50th becomes an impact player 15% of the time, and someone taken between picks 51 and 100 becomes an impact player about 7% of the time.

Of course, my definition of a "successful pick" is different than what is defined as an "impact player." Since my definition doesn't stipulate how frequently a player must score at the NHL level, my definition should yield more "successes" than there are impact players. Therefore, the success rate of NHL teams drafting a defenseman in each batch of picks should, in theory, exceed the impact player percentage of that same batch. In other words, since 39% of top-25 players become impact players, significantly more than 39% should be what I call successful picks.

I then divided each batch into two sub-groups of "scorers" and "non-scorers." If defensive defensemen are just as risky or less risky than offensive defensemen, the success rates of the two sub-groups should be similar to one another, and both should be above the impact player percentage for that range of picks. Here's what the batch-by-batch relationship between NHL % of GP and CHL Pts/GP looked like:

What I notice right away is that a higher proportion of scoring defensemen are drafted in the first 25 picks compared to the 26th - 50th picks and the 51st - 100th picks. However, within these batches, does my initial conclusion that primarily defensive guys are more risky still hold true? I assembled the information into a table to check:

As you can see, scoring CHL defensemen turned into successful draft picks at a higher rate than defensive CHL defensemen in every single batch. Even though the success rate of non-scoring CHL d-men exceeded the batch impact player percentage (labelled as "impact rate") in both the first and second rounds, the margin is so small that I'd feel confident in saying that a well below average share of these picks go on to become impact NHL players. An NHL team would be far better off using the pick to take a scoring defenseman or a forward.
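For anyone who wants to repeat the batch-by-batch comparison on their own data, here's a minimal sketch of the bookkeeping. The pick ranges and impact rates are the ones quoted from Zona above. Everything else is an assumption for illustration: the 0.6 Pts/GP cutoff between "scorers" and "non-scorers", the names `batch_for_pick` and `success_rates`, and the toy rows, which are invented and are not my sample.

```python
from collections import defaultdict

# Impact player percentages by pick range, as quoted from Zona's work.
IMPACT_RATE = {"1-25": 0.39, "26-50": 0.15, "51-100": 0.07}

SCORER_CUTOFF = 0.6  # Pts/GP; an assumed cutoff for illustration, not a hard rule


def batch_for_pick(pick):
    """Map an overall pick number (1-100) to its batch label."""
    if pick < 1 or pick > 100:
        raise ValueError(f"pick {pick} is outside the first 100 selections")
    if pick <= 25:
        return "1-25"
    if pick <= 50:
        return "26-50"
    return "51-100"


def success_rates(players):
    """Return {(batch, group): success rate}.

    Each player is a (pick, chl_pts_per_gp, successful) tuple, where
    `successful` is whatever definition of a successful pick you choose.
    """
    counts = defaultdict(lambda: [0, 0])  # [successes, total]
    for pick, pts_per_gp, successful in players:
        group = "scorer" if pts_per_gp >= SCORER_CUTOFF else "non-scorer"
        tally = counts[(batch_for_pick(pick), group)]
        tally[0] += int(successful)
        tally[1] += 1
    return {key: wins / total for key, (wins, total) in counts.items()}


# Made-up rows purely to exercise the function; NOT the real sample.
toy_players = [
    (5, 0.85, True), (12, 0.70, True), (20, 0.30, False), (24, 0.25, True),
    (30, 0.65, True), (41, 0.20, False), (48, 0.35, False),
    (60, 0.75, False), (77, 0.15, False), (95, 0.62, True),
]

rates = success_rates(toy_players)
for (batch, group), rate in sorted(rates.items()):
    print(f"{batch:>6} {group:<10} success {rate:.0%} vs impact rate {IMPACT_RATE[batch]:.0%}")
```

Printing each sub-group's success rate next to the batch impact rate makes it easy to eyeball both comparisons at once: scorers vs. non-scorers, and each sub-group vs. the impact player percentage for that range of picks.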

So, after analyzing the data in yet another way, I'm left with the same conclusion: non-scoring CHL defensemen carry a disproportionate amount of risk and fail to become NHL players more often than not. I will say that I have softened my stance on Samuel Morin and Nikita Zadorov based on the information I looked at today. However, I still believe that they're both unlikely to become NHL regulars and shouldn't be drafted if guys like Pulock, Morrissey or Theodore are still on the board. Even "risky" Jordan Subban is as likely to become an NHL regular as "safe" Zadorov is. Also, guys like USNT product Steve Santini should probably be bypassed entirely in the early rounds of the draft because their offense just isn't there. I hope I've done enough to demonstrate that defensive ability in the CHL is overvalued and that offense and puck skills have historically carried far less risk among draft-eligible defensemen.


  1. Great, great stuff. I love concrete evidence like this, based on hard data. Love it.

    I was having 'drafter's remorse' over the Canadiens' decision to bypass Jonathan-Ismaël Diaby and Mason Geertsen in the June draft. I feel much better now.

    One question: I have a feeling that the goalposts are being moved, that the game is being refereed at such a 'reckless disregard' level that crashers and bangers and crosscheckers become de facto impact players. Note the 'impact' 6'4" Sens defenceman Eric Gryba had in last year's first round of the playoffs. So, is there a way to factor in player size? How about we just consider players 6'2" and above and/or 210 lbs and above, put the data in the machine, crank the handle, and see what comes out? Do the low-scoring defencemen who reach a critical size threshold start to overcome their lack of talent, and are they given every chance to succeed, given icetime in the minors out of proportion to their skill, so that their odds of making it to the NHL and having a career is higher than a peer with similar lack of scoring success who is of more modest size?

    Of course, we'd have to compare this discrepancy in low-scoring defencemen, if it exists, to that which endures for the general population, at all positions, since the game is already tilted toward bigger players, toward Colton Orr and Greg Campbell, and away from Martin St. Louis and David Desharnais.

    So if we find there is a statistically significant greater likelihood of a bigger low-scoring defenceman making it over a smaller one, then we have to allow that Nikita Zadorov and Samuel Morin have the cherished size that modern coaches and GMs love, and they'll coddle and develop the hell out of these guys. And if they can't pass or shoot but can stand in front of the net and be a cross-checking menace to life and teeth, then they'll have long, rewarding, bloody careers in Gary Bettman and Colin Campbell's NHL.
