
Friday, 7 November 2014

Puckerings archive: Arena Goal Factors (29 Oct 2002)

What follows is a post from my old hockey analysis site puckerings.com (later hockeythink.com). It is reproduced here for posterity; bear in mind this writing is over a decade old and I may not even agree with it myself anymore. This post was originally published on October 29, 2002.


Arena Goal Factors
Copyright Iain Fyffe, 2002


One possible source of ideas when doing statistical analysis in hockey is analysis done in other sports. Of course, baseball is the most obvious choice, since so much statistical analysis has been done in that field. But one must be careful when importing ideas to consider the differences that exist between the sports involved.

The concept of park run factors is an example. Park run factors exist in baseball because different parks have different dimensions and conditions, thereby affecting the number of runs scored in each park. Several people have suggested that such an analysis could be done in hockey, but to my knowledge, no one has published any results.

First, we must ask the question: do factors of this kind (let's call them Arena Goal Factors, or AGF) make sense for hockey? I would say yes, there could be enough differences between arenas (in terms of dimensions, ice condition, etc.) to affect goal-scoring levels. I would expect the differences to be less than in baseball, but would not be surprised if they do exist.

For example, let's take a team that scored 120 goals at home and 100 on the road, and allowed 90 goals at home and 100 on the road. In this league, teams score 55% of their goals at home, while allowing 45% of their goals at home. We would therefore expect this team to score 121 goals at home and allow 86 at home, for a total of 207 goals at home. They actually had 210 total goals at home. Their AGF is therefore 210 divided by 207, or 1.014.

But before we can use this figure, we have to adjust for the fact that a team plays only half its games at home, and half on the road (in other arenas with other AGF figures). Since the sum of league AGF is equal to the number of teams, we calculate the Arena Goal Adjustment (AGA) as follows:

AGA = [(TMS-1)x(AGF)+(TMS-AGF)]/[2x(TMS-1)]

Where TMS is the number of teams in the league. I won't bother with the derivation.
So if the team in the above example played in a 25-team league, its AGA would be 1.007, meaning that players on this team would have their scoring totals increased by about 0.7% due to playing in their particular arena.
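
To make the arithmetic concrete, here is a minimal Python sketch of the AGF and AGA calculations for the example team above (the 120/100 and 90/100 splits, the 55/45 league shares and the 25-team league are taken straight from the text; the expected goals are rounded to whole numbers, as in the example).

gf_home, gf_road = 120, 100
ga_home, ga_road = 90, 100

# Expected home goals (for plus against) if the arena had no effect,
# using the league-wide home shares of 55% (goals for) and 45% (goals against)
exp_gf_home = round((gf_home + gf_road) * 0.55)   # 121
exp_ga_home = round((ga_home + ga_road) * 0.45)   # 86
expected_home = exp_gf_home + exp_ga_home         # 207
actual_home = gf_home + ga_home                   # 210

agf = actual_home / expected_home                 # Arena Goal Factor, about 1.014
tms = 25                                          # teams in the league
aga = ((tms - 1) * agf + (tms - agf)) / (2 * (tms - 1))   # Arena Goal Adjustment
print(round(agf, 3), round(aga, 3))               # 1.014 1.007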

That's the theory, anyway. But I won't string you along any more. You can calculate AGA's for each NHL team for each season, but they are not the result of the nature of the arenas. They are random chance.

I calculated AGA's for six NHL seasons: 1990/91, 1991/92, 1994/95, 1995/96, 1998/99 and 1999/2000. If AGA were meaningful, there would be a strong relationship between the AGA for a team one year and the AGA for that team the next year. The results of this inter-year correlation are as follows: between 1990/91 and 1991/92, 0.34; between 1994/95 and 1995/96, -0.05; between 1998/99 and 1999/00, -0.37. The average correlation coefficient is -0.03, which suggests the relationship is entirely random.

For further support, I calculated the correlations between goals-for factors and goals-against factors for each team. If the effects were real, then we would expect to see both goals for and goals against affected in the same way. The results of this intra-year correlation are as follows:

 Year  Correlation
 1990/91  0.13
 1991/92  0.24
 1994/95  0.05
 1995/96  0.24
 1998/99  0.32
 1999/00  -0.03

The average correlation is 0.16, which is stronger than the inter-year correlation, but still nowhere near as strong as we would need to say there is a relationship there.
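
For anyone wanting to repeat these tests, both the inter-year and intra-year figures are ordinary Pearson correlation coefficients. A minimal sketch using numpy follows; the factor values here are made up purely for illustration and are not the actual figures from the seasons above.

import numpy as np

# Hypothetical goals-for and goals-against factors for six teams in one season
gf_factor = np.array([1.02, 0.98, 1.05, 0.96, 1.01, 0.99])
ga_factor = np.array([0.99, 1.01, 1.03, 0.97, 1.00, 1.02])

# Intra-year correlation between the two factors; the inter-year test is the
# same call applied to each team's AGA in year N and in year N+1
r = np.corrcoef(gf_factor, ga_factor)[0, 1]
print(round(r, 2))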

In summary, Arena Goal Factors do not exist in hockey. You can calculate them all you like, but overall they are the result of random chance and do not reflect anything meaningful.

Friday, 31 October 2014

Puckerings archive: Factors Affecting NHL Attendance (29 Oct 2002)

What follows is a post from my old hockey analysis site puckerings.com (later hockeythink.com). It is reproduced here for posterity; bear in mind this writing is over a decade old and I may not even agree with it myself anymore. This post was originally published on October 29, 2002.
 

Factors Affecting NHL Attendance
Copyright Iain Fyffe, 2002


This paper builds upon the work of Wiedecke, who examined factors affecting NHL attendance using a multiple linear regression model. A summary of this work follows.

Data from the 1997/98 NHL season were used, giving 26 data observations. The dependent variable used was the percentage of capacity (called "Attendance Capacity"). That is, if a team averaged 15,000 fans in an arena with a capacity of 15,500, the team had an Attendance Capacity of 97% (15,000 divided by 15,500). The independent variables used were standings points, goals scored, and penalty minutes (which are all self-explanatory), and location (explained below).

Location for each team was assigned a value of 1, 2 or 3 based upon the team's geographic location. A value of 1 was assigned to the northernmost teams (Calgary, Edmonton, Montreal, Ottawa, Toronto and Vancouver). A value of 2 was assigned to Boston, Buffalo, Chicago, Colorado, Detroit, New Jersey, New York Islanders, New York Rangers, Philadelphia, Pittsburgh, and St. Louis. A value of 3 was assigned to the southernmost teams (Anaheim, Carolina, Dallas, Florida, Los Angeles, Phoenix, San Jose, Tampa Bay, and Washington).

I will expand upon Wiedecke's work in three ways:

(1) by incorporating a larger data set;
(2) by redefining the dependent variable; and
(3) by introducing a new independent variable.

 
Rather than using only the 1997/98 season, I will use data from 1995/96, 1996/97, 1997/98, 1998/99, 1999/2000, 2000/01 and 2001/02, giving 193 data observations.

 
I will use average attendance as the dependent variable, rather than percentage of capacity. By using the percentage, a team which fills 14,800 of 15,000 seats (98.7%) is considered superior to a team which fills 19,700 of 20,000 seats (98.5%). This does not reflect reality well, as the second team draws a full 33% more fans.

 
The independent variable added is Novelty. A value of 5 is assigned to a team in its first year in the league (after either an expansion or franchise relocation), and this is reduced by one for each subsequent year in the league until it reaches 0. The purpose is to determine if new teams get an attendance boost simply by being new, as is often postulated. The four independent variables used by Wiedecke are also used.
 

Variable Correlations
 
A variable correlation analysis is performed to examine the data for possible cross-correlation effects. Only one pair of variables, goals and standings points, has a significant correlation (positive 0.64). Therefore if both goals and points are found to be significant, care must be taken in their interpretation due to cross-correlation. Other pairs with less-significant correlations are attendance and points (positive 0.39), attendance and location (negative 0.31), and location and novelty (positive 0.30).

 
The following table indicates the coefficients of correlation for all variables used: attendance (ATT), points in standings (PTS), goals scored (GF), penalty minutes (PIM), location (LOC) and novelty (NOV).

 
 ATT  PTS  GF  PIM  LOC  NOV
 ATT  -  .39  .25  -.04  -.31  -.17
 PTS  .39  -  .64  -.28  -.17  -.19
 GF  .25  .64  -  .10  -.22  -.17
 PIM  -.04  -.28  .10  -  .04  -.01
 LOC  -.31  -.17  -.22  .04  -  .30
 NOV  -.17  -.19  -.17  -.01  .30  -
 

Results of the Model
 

The results of the multiple linear regression model are as follows.
 Constant (y-intercept)  13,326
 Standard error of estimate  2,071
 R-squared  0.223
 Variable  Coefficient  St. error  t-stat
 PTS  61.08  13.56  4.50
 GF  -6.90  7.16  -0.96
 PIM  0.80  0.61  1.31
 LOC  -778.93  211.43  -3.68
 NOV  -47.92  119.85  -0.40
 

Discussion of Results
 
The t-statistics of GF, PIM and NOV indicate there is little evidence that they affect attendance in any significant way. On the other hand, there is very strong evidence that PTS and LOC significantly affect attendance. These findings agree with Wiedecke.
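
For readers who want to try this themselves, here is a minimal sketch of fitting a model of this form by ordinary least squares with numpy. The observations below are placeholders invented for illustration, not the actual 193 team-seasons, so the coefficients will not match the table above.

import numpy as np

# Placeholder data: one row per team-season
att = np.array([17500, 14200, 19800, 16100, 15300, 18700, 13900, 16800], dtype=float)
pts = np.array([95, 70, 103, 84, 78, 99, 65, 88], dtype=float)
gf  = np.array([240, 205, 260, 221, 210, 251, 198, 230], dtype=float)
pim = np.array([1400, 1650, 1300, 1550, 1500, 1350, 1700, 1450], dtype=float)
loc = np.array([1, 3, 2, 2, 3, 1, 3, 2], dtype=float)
nov = np.array([0, 4, 0, 0, 2, 0, 5, 0], dtype=float)

# Design matrix with an intercept column, then an ordinary least-squares fit
X = np.column_stack([np.ones_like(att), pts, gf, pim, loc, nov])
coef, residuals, rank, _ = np.linalg.lstsq(X, att, rcond=None)
print(dict(zip(["const", "PTS", "GF", "PIM", "LOC", "NOV"], coef.round(2))))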

 
Overall, the model is not extremely useful; the R-squared figure indicates only 22.3% of the variability in attendance is explained by the model. This may indicate there are other independent variables that should be considered.
The correlation between the two significant independent variables (PTS and LOC) is -0.17, indicating there is no significant cross-correlation effect.

 
Interpretation

 
According to the model, having a good team is the most significant factor affecting attendance. Ceteris paribus, each additional standings point increases attendance by 61 fans per game. A 90-point team therefore has a 610-fan advantage in average attendance over an 80-point team.

 
The location coefficient indicates that the further south a team is, the worse its attendance is. All else being equal, a team in the southern US averages 1,558 fewer fans per game than a team in Canada. This is significant because the NHL's recent strategy has been to put as many teams in the southern US as possible, either through expansion or franchise relocations (including moving teams from Canada to the southern US). The results of this model suggest that this strategy is seriously flawed. In this case, analysis agrees with common sense: why are markets where there are hockey fans ignored in favour of markets where there are no hockey fans? At least the most recent expansion was more logical, and didn't put any more teams in the Sun Belt.
 

Reference
 

Wiedecke, Jennifer. 1999. Factors Affecting Attendance in the National Hockey League: A Multiple Regression Model. Master's thesis, University of North Carolina, Chapel Hill.

Friday, 24 October 2014

Puckerings archive: Win-Things Theory (18 Oct 2002)

What follows is a post from my old hockey analysis site puckerings.com (later hockeythink.com). It is reproduced here for posterity; bear in mind this writing is over a decade old and I may not even agree with it myself anymore. This post was originally published on October 18, 2002.
 

Theory: Win-Things
Copyright Iain Fyffe, 2002


The most common perspective put forward on win theory can be summarized as follows:

Before a game begins, each participating team has a 50% chance to win (a .500 expected winning percentage), ceteris paribus. As the game progresses, and as each team does things that affect their chances of winning or chances of losing, the expected winning percentage of each team changes. For instance, if a team scores a goal after 5 minutes of play, their percentage may change to .550, and the opponent's would therefore be .450, since the percentages necessarily sum to one.

At the crux of this theory lie two ideas: (1) before a game begins, a team's winning percentage is .500, and (2) a team does two types of things that affect its chances of winning: good things (which we'll call "win-things") and bad things (which we'll call "loss-things").

As a team, you have no significant control over what your opponents do. Therefore, at least from an analytical perspective, you can assume they will do an average number of things to win. At the beginning of a game, you have not yet done anything to win, and have no guarantee that you will do so. Therefore, your expected winning percentage before a game is not .500, but .000.

Teams try to win games, they do not try to lose them. Therefore a loss-thing is merely a failed attempt at a win-thing. Just as darkness is merely the absence of light, loss-things are merely the absence of win-things. Therefore win-things are what matters, and this is why I refer to this theory as Win-Things Theory.
 
The idea that you cannot control your opponent's actions is carried throughout the theory. For instance, in the traditional theory, scoring a goal is a very good thing (i.e., it has a high Win-Things value). Under Win-Things Theory, whether or not a shot actually produces a goal is irrelevant to the shooting side. The Win-Things were produced by the shot itself, with a higher-quality shot producing more Win-Things. Conversely, the opponent's Win-Things on the play depend on whether or not the shot is stopped. Stopping the shot produces Win-Things about equal to the Win-Things the shooting side earned by taking the shot. Not stopping the shot produces no Win-Things (it does not produce Loss-Things).

 
It should be noted that the .000 beginning expected winning percentage applies only to one-team analysis. In two-team analysis, where the actions of both teams are considered, the expected percentage would depend on the Win-Things each team has accumulated. But generally speaking, one-team analysis is more useful in analyzing what contributes to winning, by assuming opponents to be average in all regards.

 
Traditional theory focusses much attention on expected winning percentage. Win-Things Theory does not. The point is not to get your expected winning percentage up; the point is to accumulate more Win-Things than your opponents. Since you cannot control how many Win-Things your opponents accumulate, the best way to ensure this is to accumulate as many Win-Things as possible.

 
This theory supports Bill James' Win Shares system for baseball, which I have adapted into the Point Allocation method for hockey. Win Shares has been criticized for not considering "Loss Shares". Using this new theory, Loss Shares are irrelevant, and the criticism is therefore invalid. Opportunity should still be considered, but fortunately hockey games are timed, while in baseball opportunities vary greatly from game to game, based on a multitude of factors.

Friday, 17 October 2014

Puckerings archive: Shots and Save Percentage (18 Oct 2002)

What follows is a post from my old hockey analysis site puckerings.com (later hockeythink.com). It is reproduced here for posterity; bear in mind this writing is over a decade old and I may not even agree with it myself anymore. This post was originally published on October 18, 2002.
 

Theory: Shots and Save Percentage
Copyright Iain Fyffe, 2002


In my investigation into the validity of Goaltender Perseverence, I looked into the relationship between the number of shots a goaltender faces per game and his save percentage. I found that, as the number of shots per game increases, save percentage does not decrease, on average, as the fundamental assumption of Perseverence argues. In fact, there is some evidence of a positive relationship; that is, as shots increase, save percentage increases.

This evidence was met with an "it doesn't make sense" reaction from those I presented it to. Well, common sense is often dead wrong. To explain this phenomenon, I present the following theory.

For simplicity, I will discuss only two types of shots: easy and tough (referring to the goaltender's perspective). There are in actuality many varying degrees of toughness of shots, but these two will suffice for our purposes.

Easy shots are largely discretionary. They are shots that result from situations where a player could choose to shoot, or choose another play. They are of lower quality than tough shots, because they are usually taken from a greater distance than tough shots, or under less favourable circumstances.

Since easy shots are discretionary, there must be a reason that teams do not simply shoot every time, in order to maximize their goals scored. The reason could be twofold: you give up the opportunity to make a pass, which could result in a higher-quality shot, and the shot is more likely to produce a turnover, allowing a possible scoring chance for the opposition. Therefore, it is not always wise to take the shot rather than another play.
 
Save percentages on tough shots are low, and save percentages on easy shots are high. And since easy shots are primarily responsible for variation in shots faced by a goaltender (since the number of tough shots faced is relatively consistent), save percentage will increase as shots faced increases.

 
For example, let's say that the average tough shots faced per game is 5, and the save percentage on such shots is .800. This is the same for every goaltender. Any difference in shots faced is due to easy shots, which we'll say have a save percentage of .900.

 
A goaltender facing 25 shots will therefore face 20 easy shots (25 less 5). Goals against on tough shots is 1.0 (5 less .800 times 5), on easy shots 2.0 (20 less .900 times 20). 3 goals against on 25 shots is an .880 save percentage.

 
A goaltender facing 35 shots will have the same 1.0 goals against on tough shots, but will have 3.0 on easy shots (30 less .900 times 30). 4 goals against on 35 shots is an .886 save percentage. The goaltender facing more shots on average has a higher save percentage.
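
The example generalizes easily. Here is a short sketch, using the assumed figures above (5 tough shots per game at an .800 save percentage, easy shots at .900), showing how the blended save percentage drifts upward as total shots rise.

def expected_save_pct(total_shots, tough_shots=5, tough_sv=0.800, easy_sv=0.900):
    # Blended save percentage if every shot beyond the tough ones is an easy shot
    easy_shots = max(total_shots - tough_shots, 0)
    goals = tough_shots * (1 - tough_sv) + easy_shots * (1 - easy_sv)
    return 1 - goals / total_shots

for shots in (25, 30, 35):
    print(shots, round(expected_save_pct(shots), 3))   # .880, .883, .886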

 
That is my theory of how save percentage can increase as shots increase. Unfortunately, this theory cannot be tested using information that is currently available. The NHL does track certain shot data (type, location) for shots that produce a goal, but not for shots that do not produce a goal. If this information were recorded for all shots, it could be used to test this theory.

Friday, 10 October 2014

Puckerings archive: The Cost of a Penalty (18 Oct 2002)

What follows is a post from my old hockey analysis site puckerings.com (later hockeythink.com). It is reproduced here for posterity; bear in mind this writing is over a decade old and I may not even agree with it myself anymore. This post was originally published on October 18, 2002.
 

Theory: the Cost of a Penalty
Copyright Iain Fyffe, 2002


The value of odd-man play is often debated. In the mass media, much ado is made about the power-play (and, to a lesser extent, penalty-killing), calling it a key to success. Others, such as Klein and Reif, downplay its importance, noting that even-strength play is better for predicting success. 

This essay takes a conceptual approach to this problem. What, in theory, is the importance of odd-man situations? To examine this question, I will examine a theoretical team, one which is average in all respects.

This team plays in three types of situations: even-strength (ES), power-play (PP) and short-handed (SH). Examining each of these situations reveals the answer we are looking for.

Even-strength: The team is completely average. Therefore, they will score exactly as many ES goals (ESGF) as they allow (ESGA). Thus, their expected net goal differential per minute of ES time (ESMIN) is calculated as follows:

( ESGF - ESGA ) / ESMIN

Which, for reasons discussed above, is zero. 

Power-play: On the PP, a team scores about three times as often as at ES, while goals against are cut in half. PP time (PPMIN) produces a net goal differential as follows, using 1998/99 figures:

( PPGF - SHGA ) / PPMIN
= ( 1533 - 220 ) / 16326 ... minutes figure is estimated
= 0.08

Short-handed: Since PP time for one team is SH time for another, SH situations produce the converse of PP, or -0.08 goals per minute.
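
In code, the 1998/99 power-play rate quoted above works out as follows (the minutes figure is the estimate given in the text).

pp_goals_for = 1533      # league power-play goals, 1998/99
sh_goals_against = 220   # goals allowed while on the power play
pp_minutes = 16326       # estimated league power-play minutes

pp_rate = (pp_goals_for - sh_goals_against) / pp_minutes
print(round(pp_rate, 2))  # 0.08 net goals per power-play minute
# Short-handed play is the mirror image: about -0.08 per minute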

Taking this all together, an average team will have a winning record if they can obtain more PP opportunities than they give. That's badly phrased, since a team with a winning record cannot be average, but you know what I mean. This is most easily accomplished by taking as few penalties as possible, since you have rather limited control over your opponent's actions.

From this perspective, odd-man situations are extremely important, as they decide games. The team taking fewer non-coincident penalties should win, on average.

If this perspective is valid, then we should be able to predict success based upon PP opportunities for and against. I tested the coefficient of correlation between net PP opportunities and standings points for a selection of recent seasons:

 1990/91  0.11
 1991/92  0.26
 1994/95  0.02
 1995/96  -0.02
 1998/99  0.63
 1999/00  0.23
 average  0.21

The correlations provide, on average, some support for the theory. They are generally positive, but not that strong (aside from 1998/99, which is very strong). But remember, we are not considering the quality of the teams, unless you consider taking few penalties to be a quality (which you should). So there is some evidence that this theory is valid.

Friday, 3 October 2014

Puckerings archive: Greatest Teams of All Time (09 Oct 2002)

What follows is a post from my old hockey analysis site puckerings.com (later hockeythink.com). It is reproduced here for posterity; bear in mind this writing is over a decade old and I may not even agree with it myself anymore. This post was originally published on October 9, 2002.
 

The Greatest Teams of All Time
Copyright Iain Fyffe, 2002


The most thorough discussion of teams possibly deserving nomination as the greatest of all time is in Klein and Reif's Hockey Compendium. They base their conclusion that the 1929/30 Bruins are the greatest of all time on the team's .875 winning percentage, which is the highest of any team playing the minimum number of games.

There are, of course, two problems with basing the analysis solely on winning percentage. For one, an artificial games limit has to be introduced, to keep those 8-0-0 Montreal Victorias of 1898 and 10-0-0 Montreal Wanderers of 1907 from dominating the list. If we could avoid artificial restrictions like these, we could improve the analysis substantially. As it stands, these teams have no chance of being considered, no matter how great they may have been.

In addition, using winning percentage alone ignores the league context. That is, how good are the other teams in the league? Are there a few weak sisters to beat up on, or is parity the order of the day? Obviously, the greater the parity in the league as a whole, the more difficult it is to run up a high winning percentage. You don't get those cheap points; you have to fight for each win.
Therefore the analysis should be based on the degree by which a team dominates the competition, and the range of quality of said competition. One method to do this is explained below, by way of example. 

Let's examine the top two teams by Klein and Reif's analysis. The Boston Bruins of 1929/30 played in a league where the standard deviation of winning percentage was .188, which is fairly high for the era. Boston's winning percentage of .875 is .375 higher than the average (which is .500), or 1.99 standard deviations above the mean (.375 divided by .188). This is called a z-score, and this is what I will base my analysis on. It encompasses both how far above the competition a team was, and how much variation in quality there was between teams. Boston's Winning Percentage Z-Score (WPZS) is therefore 1.99, which is very impressive, but as we'll see, not the best of all time.
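
Here is a minimal sketch of the WPZS calculation. The winning percentages below are invented stand-ins for a single league-season (with the top team set at Boston's .875), so the z-score will not reproduce the 1.99 above; the text also does not say whether the population or sample standard deviation was used, so the sketch uses the population form.

from statistics import pstdev

# Hypothetical league winning percentages, averaging .500, with one dominant team
wpct = [0.875, 0.575, 0.550, 0.500, 0.475, 0.450, 0.425, 0.400, 0.375, 0.375]

mean = sum(wpct) / len(wpct)
sd = pstdev(wpct)
wpzs = [(w - mean) / sd for w in wpct]
print(round(wpzs[0], 2))   # about 2.66 with these made-up percentages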

The 1943/44 Montreal Canadiens, rated #2 by Klein and Reif, had an .830 winning percentage in a league that a had a standard deviation of winning percentage of .215 (high due to the disparity in talent caused by the war). There was less parity in this league-year than in 1929/30. Montreal's WPZS is 1.53, which while quite high is nowhere near the best of all time.

This means that, relatively speaking, Montreal had a greater benefit of weaker teams to play against than Boston did. By analyzing teams in this way, we consider both the quality of the league and we remove the need for any arbitrary restrictions. Below is the list of the top 48 teams of all time (all those with a WPZS of 1.50 or greater), from among the NHL and its predecessors, as well as the PCHA and WCHL/WHL, and the WHA. 

The surprises start at the very top. The greatest team of all time, by this analysis, is the 1995/96 Detroit Red Wings. Their .799 winning percentage had them #7 on Klein and Reif's list. But the standard deviation that year was a mere .116, quite low for the era. Other than Detroit, the best winning percentage was .634. 19 of the 26 teams were between .400 and .600. Parity was the rule, yet Detroit was able to completely dominate the league. Their 2.58 WPZS is far and away the best of all time.

The next two spots come from two teams from the same season. The epic battle between Calgary and Montreal in 1988/89 is revealed to be of truly historic proportions. Other than these two teams, no team had a winning percentage of greater than .575, or less than .381. The parity this year was amazing; the standard deviation was only .100. Calgary's percentage was .731; Montreal's was .719. While both teams miss Klein and Reif's top 20, they're #2 and #3 here. Never have two teams stood further above the rest of the league.

Spot #4 is the 1976/77 Canadiens. Montreal's 1970's dynasty also makes appearances at #9, #16, #19, and #26. That's a hell of a decade, and it's no surprise that it shows up here.

Two more recent Red Wings sides take the 5 and 7 spots, with the Dallas South Stars' outstanding 1998/99 campaign sandwiched in between. The great Bruins of 1929/30, ranked #1 by Klein and Reif, finally appear at #8.

If I were to ask you which Flyers team was the best in their history, I doubt you would answer "the 1979/80 edition, of course!" But here they are in a tie for 9th with the best the Oilers have to offer, the 1985/86 team. Another 1980's Flyers squad (1984/85) appears at #22, well above their best of the 1970's (1973/74), which comes in at a tie for #40. 80's Oilers teams also appear at #18, #36, #42, and #45. Not quite the 1970's Canadiens, but not bad.

The highest-ranked team of the pre-NHL era turns out to be the 1912/13 Quebec Bulldogs. In a league where the five other teams had records ranging from 10-10-0 to 7-13-0, Quebec went 16-4-0 to dominate the field.

The Houston Aeros were the WHA's greatest team, no surprise, claiming spots 13, 34, and 38. No other WHA club appears on the list.

Montreal's other great dynasty shows up a few times as well. 1958/59 is #18, 1955/56 is #25, 1957/58 is #28, and 1959/60 is #46. This is probably less impressive than the 1980's Oilers, but more than the Islanders teams which show up at #14, #23, and #42.

The Bruins of the early 70's don't show as well as you might expect, because they played in an expansion era. They appear "only" at #16, #24 and #32. The original Senators also appear thrice, at #25, #34 and #40, the last two from their pre-NHL days.

Finally we have the two perfect clubs mentioned before. Because these teams played in eras notable for their lack of parity, their 1.000 winning percentages are knocked down quite a bit on this list. The 1898 Victorias stand in a tie at #36, while the Wanderers show at #38. These teams (as well as the 1910/11 Senators at #40) were completely blocked out of Klein and Reif's list due to the artificial games restriction. Here, they get a fair shot.

The complete list follows:

 Rank  Team  Year  League  WPct  WPZS
 1.  Detroit Red Wings  1995/96  NHL  .799  2.58
 2.  Calgary Flames  1988/89  NHL  .731  2.31
 3.  Montreal Canadiens  1988/89  NHL  .719  2.19
 4.  Montreal Canadiens  1976/77  NHL  .825  2.18
 5.  Detroit Red Wings  1994/95  NHL  .729  2.08
 6.  Dallas Stars  1998/99  NHL  .695  2.05
 7.  Detroit Red Wings  2001/02  NHL  .707  2.02
 8.  Boston Bruins  1929/30  NHL  .875  1.99
 9.  Montreal Canadiens  1977/78  NHL  .806  1.97
 9.  Philadelphia Flyers  1979/80  NHL  .725  1.97
 9.  Edmonton Oilers  1985/86  NHL  .744  1.97
 12.  Quebec Bulldogs  1912/13  NHA  .800  1.94
 13.  Houston Aeros  1976/77  WHA  .663  1.93
 14.  New York Islanders  1981/82  NHL  .738  1.92
 15.  Boston Bruins  1938/39  NHL  .771  1.86
 16.  Boston Bruins  1970/71  NHL  .776  1.85
 17.  Montreal Canadiens  1972/73  NHL  .769  1.81
 18.  Montreal Canadiens  1958/59  NHL  .650  1.79
 18.  Edmonton Oilers  1983/84  NHL  .744  1.79
 20.  Montreal Canadiens  1975/76  NHL  .794  1.78
 21.  Colorado Avalanche  2000/01  NHL  .720  1.77
 22.  Philadelphia Flyers  1984/85  NHL  .706  1.73
 23.  New York Islanders  1978/79  NHL  .725  1.72
 24.  Boston Bruins  1971/72  NHL  .763  1.70
 25.  Ottawa Senators  1926/27  NHL  .727  1.69
 25.  Montreal Canadiens  1955/56  NHL  .714  1.69
 27.  Montreal Canadiens  1978/79  NHL  .719  1.67
 28.  Montreal Canadiens  1957/58  NHL  .686  1.65
 28.  Buffalo Sabres  1979/80  NHL  .688  1.65
 30.  St.Louis Blues  1999/2000  NHL  .695  1.62
 31.  Quebec Nordiques  1994/95  NHL  .677  1.61
 32.  Montreal Canadiens  1915/16  NHA  .688  1.58
 32.  Boston Bruins  1973/74  NHL  .724  1.58
 34.  Ottawa Senators  1916/17  NHA  .750  1.57
 34.  Houston Aeros  1974/75  WHA  .679  1.57
 36.  Montreal Victorias  1897/98  AHAC  1.000  1.56
 36.  Edmonton Oilers  1981/82  NHL  .694  1.56
 38.  Montreal Wanderers  1906/07  ECAHA  1.000  1.55
 38.  Houston Aeros  1973/74  WHA  .647  1.55
 40.  Ottawa Senators  1910/11  NHA  .812  1.54
 40.  Philadelphia Flyers  1973/74  NHL  .718  1.54
 42.  Montreal Canadiens  1943/44  NHL  .830  1.53
 42.  Edmonton Oilers  1984/85  NHL  .613  1.53
 42.  New York Islanders  1980/81  NHL  .688  1.53
 45.  Edmonton Oilers  1984/85  NHL  .681  1.52
 46.  Montreal Canadiens  1944/45  NHL  .800  1.50
 46.  Montreal Canadiens  1959/60  NHL  .657  1.50
 46.  Montreal Canadiens  1968/69  NHL  .678  1.50

For those interested in this sort of thing, here is the distribution of the top 48 seasons of all time: Montreal Canadiens 14; Boston Bruins and Edmonton Oilers, 5; Detroit Red Wings, Houston Aeros, New York Islanders, Ottawa Senators (first edition) and Philadelphia Flyers, 3; Quebec Nordiques/Colorado Avalanche 2; Buffalo Sabres, Calgary Flames, Dallas Stars, Montreal Victorias, Montreal Wanderers, Quebec Bulldogs, St.Louis Blues 1. Notably, half of the Original Six teams (Rangers, Chicago, and Toronto) fail to take a single spot, while the Habs have 29% of the top 48 to themselves.

Friday, 26 September 2014

Puckerings archive: Goal and Assist Z-Scores (04 Jul 2002)

What follows is a post from my old hockey analysis site puckerings.com (later hockeythink.com). It is reproduced here for posterity; bear in mind this writing is over a decade old and I may not even agree with it myself anymore. This post was originally published on July 4, 2002.


Goal and Assist Z-scores
Copyright Iain Fyffe, 2002


Methods have been developed in the past to identify dominant single-season performances. For example, some years ago I developed something I called goal-scoring dominance, which was calculated as the leading player's goals-per-game average divided by the second-leading player's goals-per-game average. A similar calculation was made for assists. I later discovered that Klein and Reif had developed the very same method years before, calling it Quality of Victory.

But this method suffers from a serious flaw. What if two players have outstanding seasons? The Quality of Victory formula will show that no one performed in a dominant manner, because the second-leading player's average is so high. This is not fair, nor is it accurate.

Goal z-scores (GZ) and assist z-scores (AZ) were designed to resolve this problem. It was hoped that they would not create any new problems; unfortunately this is not the case (more on this later). What we do is compare a player's performance to two things: the average individual player performance that year (in terms of goals per game or assists per game), and the degree of variation in individual player performance that year. Standard deviation is a way to measure this variability. For instance, the sets {1,2,3,4,5} and {0,1,3,5,6} have the same mean (3.0), but the second set has more variation, and therefore a higher standard deviation (2.5, compared to 1.6 for the first set). A z-score is simply the number of standard deviations an observation is above the mean (or below the mean in the case of a negative z-score). So, if we have a set of numbers whose mean is 5 and whose standard deviation is 3, then an observation of 8 would have a z-score of 1 ((8 - 5)/3). It's that simple.
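
The calculation is simple enough to show directly. A short sketch using the two sets from the paragraph above (Python's statistics.stdev gives the sample standard deviation, which is what the 1.6 and 2.5 figures correspond to):

from statistics import mean, stdev

a = [1, 2, 3, 4, 5]
b = [0, 1, 3, 5, 6]
print(mean(a), mean(b))                          # both 3
print(round(stdev(a), 1), round(stdev(b), 1))    # 1.6 and 2.5

# z-score of an observation of 8 against a mean of 5 and a standard deviation of 3
z = (8 - 5) / 3
print(z)                                         # 1.0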

In a normal distribution of events, about two-thirds of all observations will fall within one standard deviation of the mean (i.e., have a z-score between -1 and 1). 95% of observations will be within two standard deviations (z-scores between -2 and 2), and almost all will be within three standard deviations (z-scores between -3 and 3). Using z-scores we can determine how outstanding an individual performance was. For instance, only an outstanding season would produce a z-score of 3 or more.

That was the set-up. As it turns out, the results of this study are not that interesting; but what the results indicate may be of interest. The problem with the z-scores is that the top seasons of all time are dominated by recent players. For instance, in the top 40 GZ seasons, we have 5 from the 2000's (in only three years), 17 from the 1990's, 10 from the 1980's, four from the 1970's, and two each from the 1960's (Bobby Hull) and the 1930's (Charlie Conacher). So really it shows only the best of recent seasons. The assist results were predictable; Gretzky has the top 10 almost to himself, with Lemieux following. The top goal results are interesting enough to note (minimum 20 games played):

 Rank/Player  Year  GP  GZS
 1. Brett Hull  1991  78  5.95
 2. Wayne Gretzky  1984  74  5.86
 3. Mario Lemieux  1993  60  5.82
 4. Cam Neely  1994  49  5.80
 T5. Mario Lemieux  1989  76  5.64
 T5. Mario Lemieux  1996  70  5.64

So Brett Hull's 1991 campaign, while technically falling short of Gretzky's goal record, is actually more impressive than any of Gretzky's goal-scoring seasons by this analysis. But the real king of the list is Lemieux. In addition to spots 3, 5, and 6, he holds down numbers 11, 17, and 30 on the top 40 (as well as #41). Gretzky has #2, 12, 34 and 38. No contest.

But as I said, the results aren't overly interesting, because they are dominated by recent players. But the fact that recent players dominate is in itself interesting. It indicates that modern players are able to dominate the average players by a larger degree than older players. The cause of this is unclear, as it can be affected by the performance of the top players, as well as what constitutes an "average" player. But it's interesting because it's the exact opposite of what has happened in baseball, where the degree of domination by the top players has decreased over time, rather than increased. Food for thought.

Friday, 19 September 2014

Puckerings archive: Point Allocation (09 Apr 2002)

What follows is a post from my old hockey analysis site puckerings.com (later hockeythink.com). It is reproduced here for posterity; bear in mind this writing is over a decade old and I may not even agree with it myself anymore. This post was originally published on April 9, 2002.
 

Point Allocation
Copyright Iain Fyffe, 2002


For those who may not know, Bill James is quite a brilliant man. He's known primarily, of course, for his work as a stathead in baseball. It has become fashionable of late (especially amongst younger statheads) to decry James' work. I'll not get into that; I'll just say this: his pure writing about baseball is arguably more impressive and engaging than his statistical work about baseball. Even if he wasn't a brilliant statistician, his work would still be invaluable to any fan who considers himself to be knowledgeable about the game.

Fortunately for us hockey folk, some of James' work and ideas can be translated for use in hockey, or can at least be used for inspiration. This paper describes the development of the Point Allocation system, which is a method of evaluating players based on their contributions to their team's success (or lack thereof). It is based on two bits of Bill James; I have adapted his Marginal Runs analysis, which forms the basis for his Win Shares system of player evaluation, and I have extrapolated quite a bit from a fairly casual remark he made in The Politics of Glory.

I'll start with that relatively innocuous comment. James was discussing the common assertion that defence is not reflected well in statistics (at least, the statistics that most people talk about). He pointed out that, to a degree, defence is in fact reflected, through the length of the player's career, and the amount that he plays. For instance, Brooks Robinson was not a particularly great hitter. And yet, he had an exceptionally long career. Why? You know the answer: he was probably the greatest defensive third baseman who ever lived.

Epiphany! Look at this: Guy Carbonneau and Bob Gainey (for example), both with unimpressive offensive totals, both with long careers. Both renowned as defensive players, even if the stats (like plus-minus) "don't show it".

Epiphany again! Maybe we can take this concept down to a team-season level. That is, say we have two players on the same team, who contribute the same offensively, on a per-minute basis. One player plays 15 minutes per game, the other plays 18 minutes per game. Since these players are offensive equals, there can be only one explanation for the discrepancy in playing time: defence. That second player's defence must be sufficiently better than the first's to warrant three extra minutes of playing time per game. More on this later. 

Marginal Goals

Now we move on to the basic ideas behind James' Win Shares system, which I have adapted to create the Point Allocation system. Win Shares is a way of distributing a team's wins amongst its players, based on their relative contributions to the team's success. The building block of Win Shares is Marginal Runs; therefore the building block for Point Allocation is Marginal Goals.
James discovered a new method that predicts team success similarly to his famous Pythagorean analysis (which itself has been adapted to hockey by Marc Foster). I'll explain it in hockey terms. The following formula is an excellent predictor of a team's winning percentage:

E(Pct) = (MGF + MGS) / (2 x AvgG)

Where E(Pct) is expected winning percentage; MGF is Marginal Goals For, calculated as the team's goals for less one-half the league-average goals per team; MGS is Marginal Goals Saved, calculated as one and one-half times the league-average goals per team less the team's goals against; and AvgG is the league-average goals per team.

Marginal Goals is no better at predicting winning percentage than Pythagorean analysis; in fact, it's probably slightly worse. However, what Marginal Goals allows us to do is apportion the team's winning percentage (in the form of points) between a team's offence and defence, as follows:

OP = MGF / TMG x Pts
DP = MGS / TMG x Pts

Where OP is offensive points (points attributable to offence); DP is defensive points (points attributable to defence); TMG is Total Marginal Goals (Marginal Goals For plus Marginal Goals Saved); Pts is the team's points (ties plus two times wins); and MGF and MGS are defined as above.

We simply cannot do this with Pythagorean analysis. Say we have two .500 teams in a league where 300 goals is average. Team A scores and allows 350 goals, while Team B scores and allows 250. Pythagorean analysis will tell us that both of these teams should be at .500, which they are. But Team A's success is clearly tied to its offence, while Team B relies more on defence. Marginal Goal analysis allows us to determine how much success is attributable to offence and defence (in this example, Team A is 67% offence and 33% defence, and Team B is 33% offence and 67% defence).

Now if you're wondering why 0.5 and 1.5 are used, rather than some other numbers, it is because these produce a result where approximately one-half of all points will be attributed to offence, and one-half to defence. I know this to be true, as I have tested it; I just haven't proven it mathematically.

From here we depart from Mr. James' work. The next step is to allocate the OP and DP amongst the team's players, and the methods used in baseball's Win Shares are not transferable to hockey. So I have devised my own methods of doing so. As an illustrative example, I will go through the Point Allocation calculations for the 2000/01 Detroit Red Wings.

Team Analysis

The Red Wings played 82 games, collecting 107 points (note that I have eliminated points for OT losses, to keep consistency with the entire history of the NHL). In a league in which 226 goals was average, they scored 253 goals and allowed 202 goals. Therefore, their MGF was 140, and their MGS was 137, for a TMG of 277. Thus their 107 points are allocated as follows: 54.1 to offence, and 52.9 to defence.
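
A sketch of this team-level arithmetic, using the Detroit figures just quoted:

# 2000/01 Red Wings, from the text
points = 107
goals_for, goals_against = 253, 202
league_avg_goals = 226

mgf = goals_for - 0.5 * league_avg_goals        # Marginal Goals For: 140
mgs = 1.5 * league_avg_goals - goals_against    # Marginal Goals Saved: 137
tmg = mgf + mgs                                 # Total Marginal Goals: 277

offensive_points = points * mgf / tmg           # about 54.1
defensive_points = points * mgs / tmg           # about 52.9
print(round(offensive_points, 1), round(defensive_points, 1))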

Offence

Fortunately, hockey stats reflect offensive contribution quite well, through goals and assists. We cannot use scoring points, however, because of the arbitrary way in which they combine goals and assists. There is no reason to think that playmaking is 1.7 times as important as goalscoring (which is what scoring points do in modern times, where there are about 1.7 times as many assists as goals). Since there is no way to determine the relative importance of playmaking and goalscoring, we will assume they are equally important.

Therefore, to allocate offensive points, we need to calculate a new stat, Offensive Contribution (OC), which is simply defined as the player's assists divided by the team average assists per goal, plus the player's goals. For instance, Brendan Shanahan had 31 goals and 45 assists in 2000/01, and the Wings had 1.70 assists per goal. Shanahan's OC is therefore 57 (45/1.7 + 31). Doing this for every Red Wing, we find the team total is 509 (which is twice their goals, with a rounding difference). Shanahan's OC is .112 of the team total; therefore he receives 6.1 Offensive Points (OP), which is .112 of the team's points allocated to offence. Note that in this analysis, goals and assists by goaltenders are ignored. A goaltender's value lies in stopping pucks, not in shooting the puck down the ice into an empty net. Similarly, goalie assists are more a function of team offence, and have little to do with the goalie's skill.

Defence

Defence in hockey is made up of two parts: the skaters who attempt to prevent shots, and the goaltender who attempts to stop those shots that are allowed. Therefore, before we can allocate defensive points amongst a team's players, we need to determine how many go to team defence, and how many go to goaltending.

Since we have defined a defence's job as preventing shots, and a goalie's job as stopping shots, we will use team shots against and goalie save percentage in conjunction with marginal goal analysis to allocate points.

We start with team defence. The defence is responsible for preventing shots. Therefore, we calculate the MGS you would expect for the team based on their actual shots allowed. Detroit allowed 2221 shots in 2000/01, and the NHL average scoring percentage was 9.95%. Therefore we would expect Detroit's defence to have a MGS of 118 ((226 x 1.5) - (2221 x .0995)).

Now we move on to goaltending. Since a goalie's job is to stop shots, we evaluate them based on their save percentage. We calculate the MGS we would expect for the goalies based on their save percentage, and then calculate a weighted average for all the team's goaltenders, based on the goalies' playing times. Manny Legace had a .920 save percentage in 2000/01. The NHL average shots against per team (excluding empty-net shots) was 2265, and the league-average goals against per team (excluding empty-netters) was 219. Legace's MGS is therefore 148 ((219 x 1.5) - (2265 x (1 - .920))). Similarly, Chris Osgood's MGS is 109 based on his .903 save percentage. Legace played 2136 minutes, and Osgood played 2834; the weighted-average MGS for Detroit's goaltending is therefore 126.
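
The goaltending step in code, using the Legace and Osgood figures from the text (small differences from the values quoted above are just rounding):

league_shots = 2265    # league-average shots against per team, excluding empty-netters
league_goals = 219     # league-average goals against per team, excluding empty-netters

def goalie_mgs(save_pct):
    # Marginal Goals Saved implied by a goaltender's save percentage
    return 1.5 * league_goals - league_shots * (1 - save_pct)

legace = {"sv": 0.920, "min": 2136}
osgood = {"sv": 0.903, "min": 2834}

total_min = legace["min"] + osgood["min"]
weighted_mgs = (goalie_mgs(legace["sv"]) * legace["min"]
                + goalie_mgs(osgood["sv"]) * osgood["min"]) / total_min
print(round(goalie_mgs(legace["sv"])), round(goalie_mgs(osgood["sv"])), round(weighted_mgs))
# roughly 147, 109 and 125 before rounding; the text works with 148, 109 and 126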

We now need to combine these two figures. There are, on average, 4.9 skaters and one goaltender on the ice at any one time. Therefore we will assume that the skaters' value is 4.9 times as important as the goaltending value. Therefore, the DP are distributed as follows:

DPS = DP x 4.9 x MGSS / (4.9 x MGSS + MGSG)
DPG = DP x MGSG /(4.9 x MGSS + MGSG)

Where DPS is Defensive Points allocated to skaters, DPG is Defensive Points allocated to goalies, DP is team Defensive Points, MGSS is the MGS value for skaters, and MGSG is the MGS value for goalies.

For Detroit, this works out to 9.5 for goaltenders and 43.4 for skaters.
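
Putting the two MGS figures together in code (note that the 4.9 weighting appears in the skater numerator as well as in the denominator):

dp = 52.9            # Detroit's team defensive points
mgs_skaters = 118    # MGS implied by shots allowed
mgs_goalies = 126    # weighted-average goaltender MGS

denom = 4.9 * mgs_skaters + mgs_goalies
dps = dp * 4.9 * mgs_skaters / denom   # about 43.4, to the skaters
dpg = dp * mgs_goalies / denom         # about 9.5, to the goaltenders
print(round(dps, 1), round(dpg, 1))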

Allocating DP to Goaltenders

Allocating the goaltending DP among the team's goaltenders is a simple task. Simply take each goaltender's contribution to the team weighted-average MGS to determine the proportion of DPG he receives. For instance, Detroit's MGSG of 126 was made up of 64 from Manny Legace (148 MGS times his proportion of minutes played) and 62 from Chris Osgood. Therefore, Legace receives 50.8% of the DPG (64/126), or 4.8 points. Osgood receives the remaining 4.7 points.

Allocating DP to Skaters

Skaters' defence is such an ephemeral quality; we all know that the stats don't reflect defence in any meaningful way. But wait! Remember what I was discussing before I got into all of this; a player's defensive value is reflected in his playing time, when his offence and his teammates are taken into consideration. Just like Bill James' breakthrough in fielding analysis, we start at the team level. It's probably easiest to explain by diving right into the illustration.

First, we must assume that a skater's job is made up of equal parts offence and defence, on average. Then we also assume that a player's total value to a team is reflected in his playing time; that is, a team's best players will play the most. I believe these are perfectly logical and safe assumptions. We then compare each player to the average offensive numbers for his position (forward and defence) to find his offensive contribution relative to the team average. Comparing this to his actual playing time, we can estimate his defensive value to the team.

Let's look at some numbers. Detroit's forwards, in total, played 14,250 minutes in 957 games, for an average of 14.89 minutes per game. They had a total OC of 386. The average OC for a Detroit forward was therefore 0.40 per 14.89 minutes.

Now let's look at Brendan Shanahan, Detroit's top scorer in terms of points. He played an average of 18.37 minutes per game, and had an OC per 14.89 minutes of 0.57. His OC was 1.425 times that of an average team forward; if playing time depended only upon offence, we would thus expect him to play 21.22 minutes per game (1.425 x 14.89). But he played only 18.37 minutes per game; 2.85 minutes per game less than the offensive expectation. This difference must be due to his defence, which is obviously not as good as his offence. His defensive minutes per game would be 18.37 minus 2.85, or 15.52; since offence and defence are equally important, this will give us his average playing time of 18.37 per game. If playing time were based solely on defence, Shanahan would probably play about 15.52 minutes per game. He played 81 games, so his total defensive minutes would be 1,257 (15.52 times 81).

We do this for each player in turn. Note that for defencemen, the values for the Red Wings defencemen must be used (9,793 minutes in 512 games, 19.13 minutes per game, 0.24 OC per 19.13 minutes). Also note that it is possible that a player's calculated defensive time would be negative (though usually only for a player playing only a few games). Since we are dealing only with marginal contributions, negative values make no sense. Therefore, any negative value is assumed to be zero in all analysis involving Marginal Goals, and throughout the Point Allocation system. Adding up the minutes, we find Detroit's team total to be 25,209 defensive minutes. We use this total to allocate defensive points to skaters, based on their proportionate contribution to the total. Shanahan had 1,257 of the team's 25,209 defensive minutes, or 0.050 of the total. He therefore receives 0.050 of the 43.4 skater defensive points, or 2.2 points. Adding these to his 6.3 offensive points, we find his total is 8.5 points.
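
The same chain of reasoning for Shanahan's defensive minutes, in code:

# Detroit forwards, 2000/01, from the text
team_fwd_min_per_game = 14.89
team_fwd_oc_per_shift = 0.40    # OC per 14.89 minutes for an average Detroit forward

# Shanahan
min_per_game = 18.37
oc_per_shift = 0.57             # his OC per 14.89 minutes
games = 81

offensive_expectation = (oc_per_shift / team_fwd_oc_per_shift) * team_fwd_min_per_game
defensive_min_per_game = min_per_game - (offensive_expectation - min_per_game)
defensive_minutes = max(defensive_min_per_game, 0) * games   # negative values are floored at zero

team_defensive_minutes = 25209
skater_defensive_points = 43.4
dp = skater_defensive_points * defensive_minutes / team_defensive_minutes
print(round(defensive_minutes), round(dp, 1))   # about 1257 minutes and 2.2 points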

A Final Adjustment

We want this method to be applicable across all years for which the data is available. We don't want any distortion from schedule length or roster size to affect the results. Therefore adjustments are included, to normalize the results to an 80-game schedule, and also to 15 minutes per game for forwards and 20 minutes per game for defencemen. For example, Shanahan played 81 of 82 games; we adjust this to an 80-game schedule, so we give Shanahan 79 GP. He played 18.37 minutes when the average was 14.89; we adjust this for an average of 15, so we credit him with 18.51 minutes per game. So his total minutes are now 1,462 (79 times 18.51), instead of the 1,488 minutes he actually had. We then adjust his OP and DP based upon this adjusted minutes value. These adjustments will eliminate any bias when comparing today's players to players from the days when they played 70 games per year, or when only 17 skaters were allowed to dress. Similarly, goalies' minutes are adjusted to a base of 4800.
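
The final normalization, again with Shanahan as the example:

# Normalize to an 80-game schedule and a 15-minute average for forwards
schedule_length = 82
games_played = 81
min_per_game = 18.37
team_avg_min_per_game = 14.89

adj_gp = round(games_played * 80 / schedule_length)           # 79
adj_min_per_game = min_per_game * 15 / team_avg_min_per_game  # about 18.51
adj_minutes = adj_gp * adj_min_per_game                       # about 1,462
print(adj_gp, round(adj_min_per_game, 2), round(adj_minutes))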

Here are the complete team results for the 2000/01 Red Wings. GP is adjusted GP, MIN is adjusted minutes, OP is offensive points (adjusted to MIN), DP is defensive points (adjusted to MIN), and TPA is Total Points Allocated (the sum of OP and DP).

 Name  Pos  GP  MIN  OP  DP  TPA
 Lidstrom  D  80  2379  5.4  3.7  9.1
 Fedorov  F  73  1550  5.9  2.9  8.8
 Shanahan  F  79  1462  6.2  2.2  8.4
 Lapointe  F  80  1297  4.9  1.9  6.8
 Yzerman  F  53  1187  4.1  2.5  6.6
 Kozlov  F  70  1038  3.3  1.6  4.9
 Legace  G  2063  4.8  4.8
 Osgood  G  2737  4.7  4.7
 Maltby  F  70  1007  1.8  2.5  4.3
 Draper  F  73  987  2.0  2.2  4.2
 Gill  D  66  1284  0.9  3.3  4.2
 McCarty  F  70  947  2.0  2.1  4.1
 Verbeek  F  65  884  2.7  1.4  4.1
 Ward  D  71  1261  0.8  3.3  4.1
 Holmstrom  F  71  835  3.2  0.5  3.7
 Larionov  F  38  699  2.1  1.5  3.6
 Murphy  D  56  1112  1.4  1.9  3.3
 Dandenault  D  71  1194  2.1  0.9  3.0
 Duchesne  D  53  1015  1.9  1.0  2.9
 Fischer  D  54  946  0.7  2.2  2.9
 Brown  F  59  668  1.9  0.9  2.8
 Gilchrist  F  59  693  0.7  1.9  2.6
 Devereaux  F  54  550  1.0  1.1  2.1
 Chelios  D  23  549  0.2  1.7  1.9
 Butsayev  F  15  138  0.2  0.3  0.5
 Kuznetsov  D  24  237  0.2  0.3  0.5
 Williams  F  5  62  0.2  0.1  0.3
 Wallin  D  1  4  0.0  0.0  0.0

So here we have objective evidence that Nicklas Lidstrom is, in fact, Detroit's most valuable player. This surprises no one, I imagine. It is worth noting, however, based on the sampling of team calculations I have thus far made, that it is fairly rare for a defenceman to be a team's MVP (i.e., to have the highest TPA). This may seem to indicate that the system has a bias against defencemen. But I'm not sure this is true. A defenceman's job is primarily defence, and is therefore primarily passive. A defender reacts to an opponent's offense. Therefore, he has less control over his defensive contribution than an attacker has over his offensive contribution. This is reflected in the numbers, where DP tend to be flatter in distribution than OP. So while TPA indicates a team's MVP quite clearly, remember that it is not entirely fair to compare forwards and defencemen directly, since their jobs are so different.

Note how the system also provides objective evidence of the defensive prowess of the Maltby-Draper-McCarty line. Each has a DP total greater than his OP, which is fairly rare for a forward.
For comparison's sake, here are the 1975/76 Montreal Canadiens, one of the greatest teams ever iced. The ice times are estimates calculated using my method for estimating ice time.

 Name  Pos  GP  MIN  OP  DP  TPA
 Lafleur  F  80  1709  8.7  3.8  12.5
 Dryden  G  3580  11.8  11.8
 Mahovlich  F  80  1524  6.9  3.3  10.2
 Shutt  F  80  1412  5.8  3.2  9.0
 Lambert  F  80  1362  4.7  3.5  8.2
 Lapointe  D  77  2048  4.2  3.9  8.1
 Savard  D  71  1760  2.7  4.4  7.1
 Cournoyer  F  71  1097  4.8  1.7  6.5
 Risebrough  F  80  1104  3.0  2.9  5.9
 Lemaire  F  61  991  3.6  2.0  5.6
 Robinson  D  80  1574  2.3  3.1  5.4
 Awrey  D  72  1276  0.6  4.6  5.2
 Gainey  F  78  1021  2.0  3.1  5.1
 Jarvis  F  80  941  2.0  2.6  4.6
 Bouchard  D  66  1051  0.7  3.4  4.1
 Wilson  F  59  763  2.3  1.7  4.0
 Tremblay  F  71  771  1.8  1.9  3.7
 Roberts  F  74  783  1.6  2.1  3.7
 Larocque  G  1220  3.1  3.1
 Van Boxmeer  D  46  672  1.1  0.4  1.5
 Nyrop  D  19  326  0.2  1.2  1.4
 Chartraw  D  16  233  0.3  0.5  0.8
 Goldup  F  3  21  0.0  0.1  0.1
 Shanahan  F  4  25  0.0  0.1  0.1
 Andruff  F  1  10  0.0  0.0  0.0

Friday, 12 September 2014

Puckerings archive: Harmonic Points (08 Apr 2002)

What follows is a post from my old hockey analysis site puckerings.com (later hockeythink.com). It is reproduced here for posterity; bear in mind this writing is over a decade old and I may not even agree with it myself anymore. This post was originally published on April 8, 2002.


Harmonic Points
Copyright Iain Fyffe, 2002


The way it is now, assists are more important than goals in determining scoring championships. Why do I say this? Because for every goal, there are 1.7 assists awarded. Therefore, playmakers have an advantage over goal-scorers, because there are more assists for them to get a piece of. This is not fair. There is absolutely no evidence that playmaking is more important than goal-scoring in terms of scoring goals.

Total Hockey's Adjusted Scoring stats account for this somewhat, by using historic assist rates, which are lower than current rates. But it does not go far enough. Since there is no evidence to indicate which of goal-scoring and playmaking is more important, it is only fair to assume that they are equally important. Thus, when determining a "scoring champion", we should adjust the number of assists to equal the number of goals, on a league-wide basis.

More to the point, I believe we can further refine how we decide who is a "champion" scorer. For instance, say we have three players, all of whom have 80 adjusted scoring points. Player A has 25 goals and 55 assists, Player B has 40 goals and 40 assists, and Player C has 55 goals and 25 assists. I contend that Player B is the superior scorer. Why? Because he is less reliant on other players to produce goals. Player A is a playmaker; if he has no one of talent to pass to, his scoring will suffer. Player C is a goal-scorer; he needs a playmaker to maximize his value. Player B is a more complete player; he is less reliant on teammates, and is therefore a superior individual player.

I do, of course, realize that hockey is a team game, and it takes an entire team to win. But when we are assessing individual players, we should remove the effect of his teammates as much as possible. In this case, we do this with the Harmonic Points system (HP).

HP is based on the mathematical concept of the harmonic mean. The harmonic mean of two numbers is a middle number such that by whatever part of the first term the middle term exceeds the first term, the middle term exceeds the second term by the same part of the second term. Whew! In other words, if the harmonic mean is 20% (of the lesser term) greater than the lesser term, it will be 20% (of the greater term) lower than the greater term. Still confused? Maybe a numerical example will help.

Take two numbers: 100 and 200. The harmonic mean of these numbers is 133. 133 is 33% (of 100) greater than 100, and 33% (of 200) less than 200. 

I won't keep you in suspense any longer. Here's how to compute HP (which is simply the harmonic mean of goals and assists, times two):

HP = 2 x {(2 x G x A) / (G + A)}

Where HP is Harmonic Points, G is goals, and A is assists. The formula is multiplied by two to retain the "look" of the number of points, since we're taking an average of goals and assists. A player who has an equal number of adjusted goals and adjusted assists will have HP equal to his adjusted points.
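
In code, HP is just twice the harmonic mean of goals and assists. A quick sketch, applied to the three 80-point players from the earlier example:

def harmonic_points(goals, assists):
    # Twice the harmonic mean of goals and assists
    if goals + assists == 0:
        return 0.0
    return 2 * (2 * goals * assists) / (goals + assists)

for g, a in [(25, 55), (40, 40), (55, 25)]:
    print(g, a, round(harmonic_points(g, a), 2))
# Player B (40 and 40) comes out at 80; the other two land at 68.75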

In applying HP, I have used Total Hockey's Adjusted Scoring statistics. This is to eliminate much of the bias created by a player's time and place, allowing us to compare players from different eras. In addition, I will be indicating Adjusted Games Played (games played divided by length of schedule times 82), which are not disclosed in Total Hockey, but should be.

But using the idea that playmaking and goal-scoring are equal in importance, we cannot use Adjusted Scoring stats as they are. Adjusted Assists are based on historic assist rates, which, of course, are higher than historic goal rates. So I have adjusted Adjusted Assists to use the same base figure as goals.

Here are the single-season NHL leaders in HP per 82 Adjusted Games Played (minimum 20 AGP), from 1917/18 to 2000/01. There have been 34 100-HP pace seasons in NHL history:

 Rank  Name  Club  Year  AGP  HP  Per 82
 1.  Howie Morenz  Montreal  1927/28  80  145  149
 2.  Mario Lemieux  Pittsburgh  1992/93  59  103  143
 Mario Lemieux  Pittsburgh  1995/96  70  122  143
 4.  Wayne Gretzky  Edmonton  1983/84  76  128  138
 Mario Lemieux  Pittsburgh  1988/89  78  131  138
 6.  Wayne Gretzky  Edmonton  1981/82  82  129  129
 7.  Wayne Gretzky  Edmonton  1984/85  82  127  127
 8.  Wayne Gretzky  Edmonton  1982/83  82  122  122
 Mario Lemieux  Pittsburgh  2000/01  43  64  122
 10.  Howie Morenz  Montreal  1930/31  73  108  121
 11.  Wayne Gretzky  Edmonton  1986/87  81  117  118
 12.  Phil Esposito  Boston  1970/71  82  115  115
 13.  Mario Lemieux  Pittsburgh  1987/88  79  110  114
 14.  Ralph Weiland  Boston  1929/30  82  113  113
 15.  Irvin Bailey  Toronto  1928/29  82  112  112
 Jaromir Jagr  Pittsburgh  1995/96  82  112  112
 17.  Jaromir Jagr  Pittsburgh  1998/99  81  110  111
 18.  Wayne Gretzky  Edmonton  1985/86  82  110  110
 19.  Mario Lemieux  Pittsburgh  1989/90  60  80  109
 20.  Phil Esposito  Boston  1973/74  82  108  108
 Mario Lemieux  Pittsburgh  1991/92  66  87  108
 22.  Mario Lemieux  Pittsburgh  1996/97  76  99  107
 23.  Phil Esposito  Boston  1971/72  80  103  106
 Wayne Gretzky  Los Angeles  1988/89  80  103  106
 25.  Phil Esposito  Boston  1968/69  80  102  105
 26.  Teemu Selanne  Anaheim  1998/99  75  95  104
 27.  Aurel Joliat  Montreal  1927/28  82  103  103
 28.  Ebbie Goodfellow  Detroit  1930/31  82  102  102
 Gordie Howe  Detroit  1952/53  82  102  102
 Jaromir Jagr  Pittsburgh  2000/01  81  101  102
 31.  Mario Lemieux  Pittsburgh  1993/94  22  27  101
 Eric Lindros  Philadelphia  1996/97  52  64  101
 33.  Wayne Gretzky  Los Angeles  1990/91  80  98  100
 Steve Yzerman  Detroit  1988/89  82  100  100

It's clear, by this analysis, that Mario Lemieux is the greatest offensive player in NHL history, bar none. His competition is, of course, Wayne Gretzky. Lemieux is on this list nine times to Gretzky's eight, but Lemieux also dominates the top of the list, appearing three times in the top five (to Gretzky's once), and seven times in the top 20 (to Gretzky's six). Lemieux is the only player with multiple 140-HP pace seasons (Gretzky never had one), and the only player with multiple 130-HP pace seasons (three, to Gretzky's one).

In terms of career HP per 82 AGP, there are four distinct classes of players: (1) Mario Lemieux, (2) Wayne Gretzky, (3) current stars in their prime, and (4) everyone else. Lemieux, through the 2000/01 season, has 1077 HP in 799 AGP, for a per-82-game figure of 111. No one else is even remotely close. Gretzky is second with an average of 96 (1805 HP in 1543 AGP). Following these two are a bunch of players in the 80s, all current players in their prime: Eric Lindros, Jaromir Jagr, Teemu Selanne and Paul Kariya. Their averages will most likely drop over time to put them in the final group. The "everyone else" group is headed by Mike Bossy (75 average), Howie Morenz (73) and Phil Esposito (73). Other high averages belong to Gordie Howe, Jean Beliveau, Steve Yzerman, Joe Sakic, Marcel Dionne, and Bobby Hull.

The degree of separation between these classes of players serves to demonstrate how truly impressive Mario Lemieux's (and, to a lesser extent, Wayne Gretzky's) scoring exploits really are. These are the complete scorers, players who can carry a team's offence on their backs, all by themselves.

Friday, 5 September 2014

Puckerings archive: Does Playoff Experience Matter? (30 Oct 2001)

What follows is a post from my old hockey analysis site puckerings.com (later hockeythink.com). It is reproduced here for posterity; bear in mind this writing is over a decade old and I may not even agree with it myself anymore. This post was originally published on October 30, 2001 and was updated on April 9, 2002.
 

Does playoff experience matter?
Copyright Iain Fyffe, 2002


We all have heard that playoff experience is critical for playoff success. It's certainly been said often enough. If a team, or rather the players on a team, don't have enough playoff experience, they don't have a prayer of winning in the post-season. I believe it's time we put this idea to the test.

The assertion is this: teams with more playoff experience will be more successful in the playoffs than teams with less playoff experience. We will define success in the playoffs as the winning of playoff series, not necessarily winning the Stanley Cup. We will test the assertion through head-to-head playoff series matchups. If the assertion is true, then a team's relative playoff experience should be a good predictor of the outcome of the playoff series.

To test this assertion, I used data from the past three NHL seasons: 1998/99, 1999/00, and 2000/01. I defined a team's playoff experience as the total career playoff games played in previous years by all players who played for the team in that playoff year. I then used these total playoff experience figures as the sole factor in predicting the winner of each playoff series. That is, I predicted that the team with more total playoff experience would win each series. Here are the results of these predictions:

 Year  Series  Right  Wrong  Pct
 1998/99  15  10  5  .667
 1999/00  15  11  4  .733
 2000/01  15  9  6  .600
 Total  45  30  15  .667
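
For anyone who wants to reproduce this kind of tally, the prediction rule is simple to mechanize. Here is a minimal Python sketch, using made-up experience figures and outcomes rather than the actual 1998/99 to 2000/01 series; the same tally works for any single-number predictor, including the regular-season points used further below:

def tally_predictions(series):
    # Each series is (experience_a, experience_b, a_won), where the experience
    # figures are the total prior playoff games played by each team's roster.
    right = wrong = 0
    for exp_a, exp_b, a_won in series:
        predicted_a_wins = exp_a > exp_b   # predict the more experienced team
        if predicted_a_wins == a_won:
            right += 1
        else:
            wrong += 1
    return right, wrong

# Hypothetical series records, for illustration only:
sample = [(1200, 800, True), (650, 900, True), (400, 350, True)]
right, wrong = tally_predictions(sample)
print(right, wrong, round(right / (right + wrong), 3))   # prints: 2 1 0.667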

There you have it. Playoff experience is a very good predictor of playoff success, being right two-thirds of the time. But not so fast; we need to go deeper than this superficial analysis. The problem with this analysis is that a player's playoff experience is not independent of the quality of his team (defined here as regular season points). That is to say, a player's playoff success depends greatly upon him playing for a good regular-season team; but don't take my word for it.

We start with two simple points: (1) good teams generally stay good from year to year, while bad teams stay bad, and (2) teams retain a majority of the same players from year to year. Before I continue, let me demonstrate that these points are true.

To demonstrate the first point, I will simply use correlation. The following are the correlation coefficients for NHL teams' regular season points between 1998/99 and 1999/00, as well as the correlation for points between 1999/00 and 2000/01.

 Years  Correlation
 1998/99-1999/00  0.67
 1999/00-2000/01  0.77

As demonstrated in the above table, last year's points are an excellent predictor of this year's points. A correlation of 0.60 or more is considered high, and the relationship is therefore very strong.
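
For readers who want to reproduce this kind of check, these figures are presumably ordinary Pearson correlation coefficients between the two lists of team point totals. A minimal sketch, with illustrative numbers in place of the real standings:

def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length lists of numbers.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    sd_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (sd_x * sd_y)

# Illustrative point totals for the same five teams in consecutive seasons:
points_year1 = [105, 90, 88, 73, 62]
points_year2 = [101, 95, 80, 70, 68]
print(round(pearson(points_year1, points_year2), 2))   # roughly 0.93 for these made-up numbers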

The second point is also simple to demonstrate. I selected a random sample of five teams to test the stability of their rosters. I compiled the regular season games played in 2000/01 for players on each team at the end of the year who were also on the same team at the end of the previous year (1999/00). I then compared these results to the maximum number of man-games, which is (18 skaters plus one goalie) times 82 games, or 1558 man-games. Here are the results:

 Team  Games  % of Max
 Atlanta  1086  70
 Los Angeles  915  59
 New Jersey  1300  83
 Phoenix  1002  64
 Toronto  1124  72
 Average  1085  70

As you can see, the team you play for this year is most likely the team you played for last year. On average, 70% of a team's games are played by players who also played on the team at the end of the previous year.

Now that I have established these points, let's move on to this question: how good a predictor of playoff success is regular season success? I again tested playoff series for the past three years, this time using regular season points as the sole predictor of series winners. The 'neithers' in the table below are the result of teams having equal points, and therefore no winner being predicted; a 'neither' counts as half a correct prediction in the percentage column. The results:

 Year  Series  Right  Wrong  Neither  Pct
 1998/99  15  10  4  1  .700
 1999/00  15  11  4  0  .733
 2000/01  15  9  5  1  .633
 Total  45  30  13  2  .689

As you can see, regular season success is marginally better than playoff experience at predicting playoff winners. What this really shows is that playoff experience has no apparent effect on the results of the playoffs. If playoff experience were important, it would be better at predicting winners than regular season points. However, they're virtually identical as predictors. The reason for this is that playoff experience is accumulated through playing for a good team. I have shown that players generally play for the same team from year to year, that good teams are generally good from year to year, and that good teams are successful in the playoffs. Therefore, players on good teams will accumulate large totals of playoff experience not by "knowing what it takes to win in the playoffs," but by playing for a good team that will tend naturally to win more, both in the regular season and in the playoffs.

The crucial point is this: playoff experience is the result of playing for a successful regular season team. Playoff experience is simply a reflection of playing for a good team. There is absolutely no evidence that having greater playoff experience will affect the result of a playoff series. If playoff experience were important, it would be better than regular-season points in predicting playoff series winners; in fact, it's marginally worse. In reality, it's the quality of the team that matters, not the playoff experience of the players.