Volume 5, Number 1, September 2004

 

Bowie Kuhn's Worst Nightmare
 
 
James J. Cochran
Department of Marketing & Analysis
College of Administration & Business
Louisiana Tech University
Ruston, LA 71272
USA
 
 
 
 

Abstract

Students are presented with a narrative concerning the contentious negotiations between Major League Baseball (MLB) franchise owners and the Major League Baseball Players Association (MLBPA) over a new collective bargaining agreement in the early 1980s. Details of the 1981 players' strike, ensuing collective bargaining agreement, and strategy for resuming the interrupted season and facilitating the playoffs are provided. Pre-strike win/loss records and number of post-strike scheduled games are provided for each team. This case gives students great insight into the formulation of integer programming problems and interpretation of the resulting optimal solutions; in this case they must formulate constraint satisfaction problems with integer decision variables to assess the advisability of the MLB plan for resuming the season and revising the playoff structure to reflect the games lost to the strike (and the potential for an occurrence of Simpson's Paradox). As such, this case is also unique in its use of optimization to demonstrate a statistical phenomenon. Furthermore, although the case deals with a baseball-related problem, it is relatively self-contained and requires no understanding of how baseball is played, so students who are unfamiliar with the sport will not be seriously disadvantaged.

1. Introduction

Teaching applied quantitative methods (Statistics and Operations Research) can be daunting - many students are poorly prepared, have little prior knowledge or understanding of quantitative methods, and are anxious about the required mathematics. Furthermore, although applied quantitative methods are usually taught within the context of some area(s) of application, such courses are often taught before the students have a firm grounding in the chosen contextual area. By working within a context that is more familiar and interesting to students, an instructor of applied quantitative methods can increase her/his students' interest in and tolerance for the frustrating and complex concepts that comprise much of the course. The result is better understanding and retention by students of course material.

In addition to serving as a copious source of easily obtained and understood data, sports provides a context that is familiar and interesting to many students. Several instructors have reported great success using sports examples to motivate and illustrate various quantitative methods. Lackritz (1981) provides an early discussion of the use of sports data in the teaching of statistics. Lock (1997) uses data on point spreads, odds, and outcomes for five years of NFL regular season and playoff games to address a variety of statistics issues. Simonoff (1998) uses the 1998 baseball season and the dramatic competition between Mark McGwire and Sammy Sosa to break Roger Maris' single-season home run record to illustrate various types of exploratory analyses. Albright (1988) uses sequences of at-bats for MLB players to illustrate the types of exploratory and confirmatory analyses that can be performed to look for streakiness. Nettleton (1998) uses college basketball game scores and analyzes home-court advantage to illustrate introductory statistics concepts. MLB Hall-of-Fame election results and measurements of individual player performances are used by Cochran (2000) to teach descriptive statistics as well as methods for classification and discrimination. The strength of National Football League (NFL) teams is assessed using principal components methodology in a course taught by Watnik and Levine (2001).

Wiseman and Chatterjee (1997), Watnik (1998), and Cochran (2002) use sports economics examples to motivate students; Wiseman, Chatterjee, and Watnik use MLB player salaries in their introductory statistics courses, while Cochran uses annual MLB franchise attendance and team performance measures for project courses in linear models and econometrics.

Many authors have documented their use of sports examples to motivate the study of probability. Quinn (1997) and Starr (1997) each use the NBA draft lottery to illustrate various concepts in probability, while Quinn (1997) also uses anomalous sports performances to demonstrate probability concepts.

Several instructors teach courses (predominantly in probability and statistics) primarily or exclusively from a sports context. Cochran (2001) uses sports simulation board games to introduce a number of concepts in the probability segment of an introductory statistics course. Reiter (2001) teaches a special three-week course on statistics and sports. Gallian (2001) utilizes statistical concepts to analyze sports achievements and strategies in a freshman mathematics and sports seminar offered to liberal arts students. Albert (2002) describes a special section of an introductory statistics course where all course material is taught from a baseball perspective. McKenzie (1996) discusses his efforts to teach an applied statistics course with a sports theme. Costa and Huber (2003) describe a one credit hour course that focuses on sabermetrics (defined by Albert in An Introduction to Sabermetrics as "mathematical and statistical analysis of baseball records"). Finally, Albert (2003) has published a sports-oriented textbook to be used in introductory statistics courses, and Ross (2004) has published a textbook to be used in a freshman seminar similar to the course taught by Gallian.

Although these examples are all taken from probability and statistics, it is worth noting that several papers and books have been published on using operations research to analyze sports problems, including the series of seminal articles by Lindsey (1959, 1961, 1963). Ladany and Machol (1977) also edited an important early volume of papers on optimization strategies in sports. Use of such examples in the operations research classroom has not been as well-documented, but this special issue of INFORMS Transactions on Education serves as evidence that they do exist.

While sports examples are not a panacea for applied quantitative methods courses, these authors have found that most students (even those who are not knowledgeable about sports) enjoy and appreciate sports examples in quantitative methods courses if the context is carefully explained. Furthermore, anecdotal evidence suggests that a student who develops an understanding of a quantitative method through a sports example is relatively well-equipped to apply the technique in other familiar contexts later in her/his education and career.

The contentious negotiations between MLB franchise owners and the MLBPA over a new collective bargaining agreement in the early 1980s provide the backdrop for the case discussed in this paper. These circumstances led to MLB's first in-season work stoppage, a strike that lasted for fifty days and resulted in the loss of seven hundred and twelve regular season games. To deal with the loss of approximately one-third of the 1981 regular season, MLB Commissioner Bowie Kuhn, MLBPA Executive Director Marvin Miller, and several owners devised the following strategy: The regular season resumed as scheduled and no games lost to the strike were rescheduled. The team with the most pre-strike wins in each division was declared the 'first-half' division winner, while the team with the most post-strike wins in each division was declared the 'second-half' division winner. The first-half and second-half winners in each division were then to meet in a three game 'division playoff' to determine the ultimate division champion. If the same team won a division in both the first and the second halves of the season, its playoff opponent would be the division's post-strike second place team. This solution, obviously devised to simultaneously recognize teams for their pre-strike performances and provide incentive for all teams to be competitive during the post-strike period, quickly came to be known as the 'split season.' A student analyzing this case must address the potential pitfalls of the split season solution developed by Mr. Kuhn, Mr. Miller, and the owners. Specifically, the student must address two fundamental questions: Given the pre-strike results and number of post-strike games remaining, is it possible for a team to:

i) win neither the pre- nor post-strike division title but have the best overall (combined pre- and post-strike) win/loss record in its division?

or

ii) win both the pre- and post-strike division titles but have an inferior overall (combined pre- and post-strike) win/loss record relative to a division rival?

This case has been used successfully in an undergraduate introductory mathematical programming course offered for business students majoring in management science. This course is the students' first exposure to mathematical programming and provides an overview of mathematical programming (linear, integer, goal, and nonlinear programming; network flow problems; combinatorial optimization) with a strong emphasis on modeling and formulation. The case has also been used with similar success in an MBA core introductory operations research course. This case-based course provides a less extensive overview of mathematical programming (linear, goal, and integer programming, network flow problems) and some coverage of basic stochastic models (decision trees, queuing, Markov chains, simulation) with a strong emphasis on modeling and formulation. In both courses, students are encouraged to discuss the case in groups of 2-3 but are required to submit individual two-page analyses. Students in both courses have spent several weeks learning about mathematical programming and using Solver© on such problems prior to assignment of the case.

As prescribed by Cochran (2000), the case is due in both of these classes immediately prior to an examination on mathematical programming and is the basis of a class discussion that is intended to help students prepare for the ensuing examination. Because these courses and the examinations emphasize modeling and formulation, this case provides strong support for the course objectives. Although the case presents a challenging formulation for students at either of these levels, it is not beyond their capability (particularly when students are encouraged to discuss the issues in small teams as they work through the case). Encouraging students to discuss the case in small teams also mitigates the potential difficulty presented by students who have little background or interest in baseball; the case assignment takes on the flavor of a typical business assignment in which a team of employees with widely different backgrounds is expected to work cooperatively to solve the problem. Presenting the assignment to the students in these terms and devoting a class meeting to discussion of the case prior to the exam effectively defuses anxiety among non-sports minded students with regard to the case. The author's experience suggests that many of these students actually enjoy the opportunity to learn a little about the baseball industry and/or see the sport from a different perspective. Finally, although the case deals with a baseball-related problem, it requires no understanding of how baseball is played; terms that are idiosyncratic to baseball or sports are carefully defined or explained in the introductory paragraphs, so the case is relatively self-contained - explicitly pointing this out also serves to allay non-sports minded students' anxieties about this case.

Faculty can also direct students to several websites that are useful for the analysis of this case. A few potentially useful sites include:

Of course, this is not an exhaustive list. Thousands of web sites are devoted to various aspects of baseball and could potentially be of interest; this wealth of readily available information is one of the several interesting aspects of this case.

2. The Salient Issues

This case raises two principal issues: Given the pre-strike results and number of post-strike games remaining, is it possible for a team to:

i) win neither the pre- nor post-strike division title but have the best overall (combined pre- and post-strike) win/loss record in its division?

or

ii) win both the pre- and post-strike division titles but have an inferior overall (combined pre- and post-strike) win/loss record relative to a division rival?

The ramifications are clear - if either of these events were to occur, fans would question the fairness and legitimacy of the split-season solution. If the split-season solution gave fans reason to question the legitimacy of competition, the loss of perceived integrity that MLB would suffer immediately after its exceedingly acrimonious first in-season work stoppage would be devastating for the baseball industry. In analyzing this case, the student must understand how the win/loss records of baseball teams are evaluated, i.e., on what criteria are teams ranked. Most students who are unfamiliar with baseball will immediately focus on teams' win/loss percentages, defined for some Team A as

$$\text{win/loss percentage}_A = \frac{\text{wins}_A}{\text{wins}_A + \text{losses}_A}$$

However, the common criterion for comparing the win/loss records of two baseball teams A and B is the number of games Team B is behind Team A in the standings. The number of games Team B is behind Team A in the standings (from this point forward referred to as games in the standings) is defined as

$$\text{games Team B is behind Team A} = \frac{(\text{wins}_A - \text{wins}_B) + (\text{losses}_B - \text{losses}_A)}{2}$$

These definitions are provided in the case; however, they could be removed and students could be expected to find them on their own (the author has done so when assigning this case to encourage students to develop some skill in performing background research; students can find this definition on several websites including NetShrine's "Baseball Statistics Glossary").

Students must be made to understand that these criteria will not always yield equivalent standings (ranking of teams) unless all teams play an equal number of games (and that most fans don't understand that this could happen!). Because all franchises are scheduled to play an equal number of regular season games (162), it is obviously impossible for a team with a lower winning percentage than another team to be superior in terms of games in the standings at the conclusion of the season if both teams complete their schedules. However, the 1981 work stoppage and resulting split schedule left teams with unequal numbers of games in both the pre- and post-strike periods (as well as over the entire season). Many students will have difficulty accepting that it is possible for these two criteria to result in different standings under any circumstances; the instructor can provide a simple example such as the following to convince these skeptical students:

Consider two teams and their win-loss records, Team A with seven wins and three losses and Team B with four wins and one loss. Team A is ½ game ahead of Team B…

…but Team B (0.800) has a better win/loss percentage than Team A (0.700).
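
The comparison can be checked in a few lines of Python. This is only an illustrative sketch: the function names `win_pct` and `games_behind` are ours, not part of the case, and they simply encode the usual definitions of win/loss percentage and games in the standings.

```python
from fractions import Fraction


def win_pct(wins, losses):
    """Win/loss percentage: wins divided by games played."""
    return Fraction(wins, wins + losses)


def games_behind(leader, trailer):
    """Games `trailer` is behind `leader`; each record is a (wins, losses) pair."""
    (w_a, l_a), (w_b, l_b) = leader, trailer
    return Fraction((w_a - w_b) + (l_b - l_a), 2)


team_a = (7, 3)  # seven wins, three losses
team_b = (4, 1)  # four wins, one loss

gap = games_behind(team_a, team_b)  # positive means Team A leads in the standings
pct_a, pct_b = win_pct(*team_a), win_pct(*team_b)

print(f"Team B is {float(gap)} games behind Team A")
print(f"Team A pct = {float(pct_a):.3f}, Team B pct = {float(pct_b):.3f}")
```

Exact fractions are used so that half-game differences and percentages are compared without rounding error.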

This relatively trivial example (in conjunction with the students' initial skepticism) demonstrates why the potential for such an occurrence (and the public relations debacle that would certainly ensue) should have been a concern of MLB during the 1981 season and underscores why it is important to consider both win/loss percentage and games in the standings when addressing the issues in this case.

3. The Analyses

Using both win/loss percentage and games in the standings, we now consider the two questions

i) Can a team win neither the pre- nor post-strike division title but have the best overall (combined pre- and post-strike) win/loss record in its division?

and

ii) Can a team win both the pre- and post-strike division titles but have an inferior overall (combined pre- and post-strike) win/loss record relative to a division rival?

In modeling these questions, students generally define decision variables similar to the following:

wi,j is the number of games won by team i in period j, j=1 (pre-strike), 2 (post-strike)

Given these decision variable definitions, other values can easily be defined:

gi,j is the number of games played by team i in period j
li,j = gi,j - wi,j is the number of losses suffered by team i in period j

3.1 Can a team win neither the pre- nor post-strike division title but have the best overall (combined pre- and post-strike) win/loss record in its division?

If games in the standings is used as the criterion, a pair of constraints must be constructed to force two different teams (say i=1 and i=2) to finish ahead of a third team (i=3) in the pre- and post-strike periods, respectively. These constraints could look like

Similarly, a pair of constraints must be constructed to force teams 1 and 2 (who had superior pre- and post-strike records, respectively) to each have an inferior overall (combined pre- and post-strike) win/loss record in terms of games in the standings relative to team 3. These constraints could look like

The right-hand side for each of these constraints is set to 0.50 because that is the smallest possible advantage in games in the standings that one team can have over another team (determining this is another interesting challenge for students − many have difficulty with this aspect of the case).

Also note the number of wins for team i in period j must be a nonnegative integer that does not exceed the number of games played by team i in period j:

The resulting feasible region is given by
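
As a sketch of what such a region could look like (a reconstruction from the definitions of w_{i,j}, g_{i,j}, and l_{i,j} = g_{i,j} - w_{i,j}, and not necessarily the author's exact display), with team 1 ahead of team 3 before the strike, team 2 ahead of team 3 after it, and team 3 ahead of both overall:

```latex
\begin{aligned}
\tfrac{1}{2}\big[(w_{1,1}-w_{3,1}) + (l_{3,1}-l_{1,1})\big] &\ge 0.5 && \text{(team 1 ahead of team 3, pre-strike)}\\
\tfrac{1}{2}\big[(w_{2,2}-w_{3,2}) + (l_{3,2}-l_{2,2})\big] &\ge 0.5 && \text{(team 2 ahead of team 3, post-strike)}\\
\tfrac{1}{2}\Big[\sum_{j=1}^{2}(w_{3,j}-w_{1,j}) + \sum_{j=1}^{2}(l_{1,j}-l_{3,j})\Big] &\ge 0.5 && \text{(team 3 ahead of team 1, overall)}\\
\tfrac{1}{2}\Big[\sum_{j=1}^{2}(w_{3,j}-w_{2,j}) + \sum_{j=1}^{2}(l_{2,j}-l_{3,j})\Big] &\ge 0.5 && \text{(team 3 ahead of team 2, overall)}\\
w_{i,j} &\in \{0, 1, \dots, g_{i,j}\} && i = 1,2,3;\ j = 1,2
\end{aligned}
```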

Students generally choose one division on which to initially focus their efforts - most choose to concentrate on the National League (NL) West Division because it had the tightest pre-strike standings (the Cincinnati Reds finished 0.5 games behind the Los Angeles Dodgers). Furthermore, it is logical to select the team in the chosen division with the worst pre-strike record as the team that will (hypothetically) finish with a better post-strike record than Cincinnati. In this instance, that team is the San Diego Padres (with twenty-three wins and thirty-three losses). Thus, i=1 represents the Los Angeles Dodgers, i=2 represents the San Diego Padres, and i=3 represents the Cincinnati Reds in the National League West Division model.

Once a student selects a division on which to initially focus, the feasible region can be simplified by recognizing that the constraint forcing team 1 to finish with a better pre-strike record than team 3 can be eliminated (because the pre-strike standings are known). The feasible region can be further simplified through substitution of known values for i) number of pre-strike wins by each team (w1,1=36, w2,1=23, w3,1=35), ii) number of pre-strike games played by each team (g1,1=57, g2,1=56, g3,1=56), and iii) number of post-strike games played by each team (g1,2=53, g2,2=54, g3,2=52). The simplified feasible region is
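
Carrying out this substitution under the games-in-the-standings criterion gives, as a sketch (the algebra is ours; it uses l_{i,j} = g_{i,j} - w_{i,j}, so the pre-strike losses are 21, 33, and 21 for teams 1, 2, and 3):

```latex
\begin{aligned}
(w_{2,2} - w_{3,2}) - 1 &\ge 0.5 && \text{(San Diego ahead of Cincinnati, post-strike)}\\
w_{3,2} - w_{1,2} &\ge 0.5 && \text{(Cincinnati ahead of Los Angeles, overall)}\\
(w_{3,2} - w_{2,2}) + 13 &\ge 0.5 && \text{(Cincinnati ahead of San Diego, overall)}\\
w_{1,2} \in \{0,\dots,53\},\quad w_{2,2} \in \{0,\dots,54\},\quad w_{3,2} &\in \{0,\dots,52\}
\end{aligned}
```

Because the w_{i,2} are integers, the first two constraints amount to w_{2,2} - w_{3,2} ≥ 2 and w_{3,2} - w_{1,2} ≥ 1.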

At this point, the student must recognize that integer programming is actually being used in this case to determine feasibility and not optimality. Understanding this point is crucial to the development of an objective function for this formulation; it provides great latitude in the choice of objective function. One could, for example, choose to optimize (either maximize or minimize) the number of post-strike games played or won by any of the three teams.

This formulation is feasible, which implies that Cincinnati (or some other MLB team) could win neither the pre- nor post-strike division title and still have the best overall (combined pre- and post-strike) win/loss record in terms of games in the standings in its division.
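
Feasibility can also be confirmed by brute force, without a solver. The sketch below is an assumption-laden illustration rather than the author's Solver model: it enumerates every combination of post-strike wins for the three NL West teams considered here, and the names `lead` and `feasible` are ours.

```python
from itertools import product

# Pre-strike records (wins, losses) and post-strike games, taken from the text.
PRE = {"LA": (36, 21), "SD": (23, 33), "CIN": (35, 21)}
POST_GAMES = {"LA": 53, "SD": 54, "CIN": 52}


def lead(a, b):
    """Games in the standings by which record a leads record b."""
    return ((a[0] - b[0]) + (b[1] - a[1])) / 2


def feasible(w_la, w_sd, w_cin):
    """True if these post-strike win totals satisfy all four constraints."""
    wins = {"LA": w_la, "SD": w_sd, "CIN": w_cin}
    post = {t: (wins[t], POST_GAMES[t] - wins[t]) for t in wins}
    total = {t: (PRE[t][0] + post[t][0], PRE[t][1] + post[t][1]) for t in wins}
    return (
        lead(PRE["LA"], PRE["CIN"]) >= 0.5          # LA ahead of Cincinnati pre-strike
        and lead(post["SD"], post["CIN"]) >= 0.5    # San Diego ahead of Cincinnati post-strike
        and lead(total["CIN"], total["LA"]) >= 0.5  # Cincinnati ahead of LA overall
        and lead(total["CIN"], total["SD"]) >= 0.5  # Cincinnati ahead of San Diego overall
    )


# Post-strike wins range over 0..53 (LA), 0..54 (San Diego), 0..52 (Cincinnati).
solutions = [w for w in product(range(54), range(55), range(53)) if feasible(*w)]
print("feasible:", len(solutions) > 0)
```

Any element of `solutions` is a witness; for instance, post-strike win totals of 9 (Los Angeles), 12 (San Diego), and 10 (Cincinnati) satisfy every constraint.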

A similar formulation can be used if win/loss percentage is used as the criterion. Given a sufficiently small value ε, the associated feasible region (after simplification) is given by
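
For the NL West, a region of this kind could be sketched as follows (our reconstruction from the values given above, with combined games of 110, 110, and 108 for Los Angeles, San Diego, and Cincinnati; the pre-strike constraint is omitted because Los Angeles's 36/57 already exceeds Cincinnati's 35/56):

```latex
\begin{aligned}
\frac{w_{2,2}}{54} - \frac{w_{3,2}}{52} &\ge \varepsilon && \text{(San Diego ahead of Cincinnati, post-strike)}\\
\frac{35 + w_{3,2}}{108} - \frac{36 + w_{1,2}}{110} &\ge \varepsilon && \text{(Cincinnati ahead of Los Angeles, overall)}\\
\frac{35 + w_{3,2}}{108} - \frac{23 + w_{2,2}}{110} &\ge \varepsilon && \text{(Cincinnati ahead of San Diego, overall)}\\
w_{1,2} \in \{0,\dots,53\},\quad w_{2,2} \in \{0,\dots,54\},\quad w_{3,2} &\in \{0,\dots,52\}
\end{aligned}
```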

Again, students have great latitude in the choice of objective function. One could again choose any of the objective functions considered for the previous formulation (as well as several others). Furthermore, the question of the appropriate size of ε and the effect of the size of ε on the solution time is interesting in its own right - this creates another opportunity for a provocative class discussion.

While this question certainly can (and will) be modeled by many students, it is relatively simple to answer without use of any modeling - a student can logically determine that this could easily occur (in terms of either win/loss percentage or games in the standings) if one team finished a close second to a different team in each segment of the split schedule. As a student who does independent reading on the 1981 season will find, this actually happened in both NL divisions:

Table 1. Pre-strike, Post-strike, and Combined win/loss Records

In terms of games in the standings, the Philadelphia Phillies won the pre-strike NL East Division title (by 1.5 games over the St. Louis Cardinals) and the Montreal Expos won the post-strike NL East Division title (by 0.5 games over St. Louis), while St. Louis had a superior overall record (by 2.0 games over the Montreal Expos). Similarly, in the NL West Division the Los Angeles Dodgers won the pre-strike division title (by 0.5 games over the Cincinnati Reds) and the Houston Astros won the post-strike division title (by 1.5 games over Cincinnati), while Cincinnati had a superior overall record (by 4.0 games over Los Angeles). Despite having the best record of any MLB team in 1981, Cincinnati did not qualify for the playoffs! The results were similar with regard to win/loss percentage.

3.2 Can a team win both the pre- and post-strike division titles but have an inferior combined split season win/loss record relative to a division rival?

This question is far more challenging - it is more difficult to answer without modeling and no examples of this result occurred during the 1981 season. Additionally, students must recognize that they can use the same variables and values wi,j and gi,j defined in Section 3.1, but they only need to consider two teams (a team that wins both the pre- and post-strike division titles and a team that doesn't win either title) to model this question.

3.2.1 Can a team win both the pre- and post-strike division titles but have an inferior combined split season win/loss record in terms of games in the standings relative to a division rival?

When using games in the standings as their criterion, students generally let i=1 for the team that wins both the pre- and post-strike division titles and i=2 for the team that doesn't win either title but has the superior combined split season win/loss record, and then develop an objective function of some form similar to:

This objective function allows for the determination of the existence of a combined split-season advantage in games in the standings of a team who did not win the pre-strike division title (i=2) over the team who did win the pre-strike division title (i=1). Note that the objective function can be simplified for this problem because the values of wi,1 and gi,j are each known for all i and j.

In order to force a post-strike advantage in games in the standings for the team that won the pre-strike division title (i=1) over a team that did not win the pre-strike division title (i=2), the following constraint is constructed:

The right-hand side of this constraint is again set to 0.50 because that is the smallest possible advantage in games in the standings that one team can have over another team.

Finally, the number of wins for team i in the post-strike period must be a nonnegative integer that does not exceed the number of games played by team i during the post-strike period. The resulting formulation is

Suppose we formulate this problem for the NL West division (the division with the closest pre-strike standings) and let i=1 represent the Los Angeles Dodgers (who won the pre-strike division title) and i=2 represent the Cincinnati Reds (who finished second in the division in the pre-strike standings). The formulation that results after substituting known values for i) number of pre-strike wins by each team (w1,1=36, w2,1=35), ii) number of pre-strike games played by each team (g1,1=57, g2,1=56), and iii) number of post-strike games played by each team (g1,2=53, g2,2=52) simplifies to
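
A sketch of that simplification, reconstructed here from the games-in-the-standings definition with l_{i,j} = g_{i,j} - w_{i,j} (the layout is ours and may differ from the author's display):

```latex
\begin{aligned}
\max\ z &= \tfrac{1}{2}\big[(35 + w_{2,2} - 36 - w_{1,2}) + \big((21 + 53 - w_{1,2}) - (21 + 52 - w_{2,2})\big)\big] = w_{2,2} - w_{1,2}\\
\text{s.t.}\quad & \tfrac{1}{2}\big[(w_{1,2} - w_{2,2}) + \big((52 - w_{2,2}) - (53 - w_{1,2})\big)\big] = (w_{1,2} - w_{2,2}) - 0.5 \ge 0.5\\
& w_{1,2} \in \{0,\dots,53\},\quad w_{2,2} \in \{0,\dots,52\}
\end{aligned}
```

The single constraint is equivalent to w_{1,2} - w_{2,2} ≥ 1, so z can be at most -1.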

In this case, the constants in the objective function (values of wi,1 and gi,j) completely cancel each other. If a constant remained in this objective function after simplification (possibly when formulating this problem for another division), the student could choose to retain the constant because its presence enables a quick determination of whether it is possible for a team to win its division in both portions of the split season but still not have the best overall (combined pre- and post-strike) win/loss record in its division in terms of games in the standings (a positive feasible value for this objective function indicates that such an occurrence is possible).

This problem, which can easily be solved by inspection, yields several alternate optima of the form

$$w_{2,2} = w_{1,2} - 1$$

each with an associated objective value of -1.0 (note that w1,2 ∈ {1, 2, …, 53} so that w2,2 ≥ 0). This result indicates that, given the pre-strike results and number of post-strike games remaining, it is not possible for Los Angeles to win its division in both portions of the split season and have a worse overall (combined pre- and post-strike) win/loss record in terms of games in the standings relative to Cincinnati.
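
The same conclusion can be checked by enumeration. The following is a minimal sketch, not the author's Solver model; the helper `lead` and the variable names are ours, and the data are the NL West values quoted above.

```python
from itertools import product

# (wins, losses) before the strike: 1 = Los Angeles, 2 = Cincinnati.
PRE = {1: (36, 21), 2: (35, 21)}
POST_GAMES = {1: 53, 2: 52}


def lead(a, b):
    """Games in the standings by which record a leads record b."""
    return ((a[0] - b[0]) + (b[1] - a[1])) / 2


best = None
optima = []
for w1, w2 in product(range(POST_GAMES[1] + 1), range(POST_GAMES[2] + 1)):
    post1 = (w1, POST_GAMES[1] - w1)
    post2 = (w2, POST_GAMES[2] - w2)
    if lead(post1, post2) < 0.5:
        continue  # Los Angeles must also win the post-strike title
    total1 = (PRE[1][0] + post1[0], PRE[1][1] + post1[1])
    total2 = (PRE[2][0] + post2[0], PRE[2][1] + post2[1])
    z = lead(total2, total1)  # Cincinnati's overall advantage over Los Angeles
    if best is None or z > best:
        best, optima = z, [(w1, w2)]
    elif z == best:
        optima.append((w1, w2))

print(best, len(optima))
```

The best attainable value of Cincinnati's overall advantage is negative, and every optimum pairs some number of Los Angeles wins with one fewer Cincinnati win.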

The optimal objective values for the formulations associated with each of the other divisions (NL East, American League East and West) are also negative. These results demonstrate that, given the pre-strike results and number of post-strike games remaining, it is not possible for any MLB team to win both the pre- and post-strike division titles but have an inferior overall (combined pre- and post-strike) win/loss record in terms of games in the standings relative to a division rival.

There is, however, another issue lurking in this case - due to inclement weather, some scheduled games may be cancelled (rainouts). Thus, an implicit assumption of the previous model regarding the number of post-strike games remaining may be false. If this assumption is relaxed, the gi,2 may be considered decision variables, and the salient question becomes: Given the pre-strike results, is it possible under any circumstances (i.e., any values of gi,2) for an MLB team to win both the pre- and post-strike division titles but have an inferior overall (combined pre- and post-strike) win/loss record in terms of games in the standings relative to a division rival? In addition to constraining the post-strike advantage of the team that won the pre-strike division title (i=1) over a team that did not win the pre-strike division title (i=2) to be positive, we must now add the following constraint

to force the team that did not win either the pre-strike or post-strike division title (i=2) to have a positive overall advantage in games in the standings over the team that won both the pre-strike and post-strike division titles (i=1). We also must now limit the number of games played by each team in the post-strike period (gi,2) so they are nonnegative integers that do not exceed the number of post-strike games scheduled for team i. If we define si,2 to be the number of post-strike games scheduled for team i, the resulting formulation is

The simplified formulation of this problem for the NL West (with s1,2=53, s2,2=52) is

If no student notices, the instructor should point out that the first two constraints combine to render this formulation infeasible - when the first constraint is multiplied by -1.0, the left hand side (which is now identical to the left hand side of the second constraint) is constrained to values no greater than -1.00. Thus, this constraint and the second constraint cannot be satisfied simultaneously.
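
The conflict can be written out explicitly. As a sketch for the NL West (our algebra, assuming l_{i,2} = g_{i,2} - w_{i,2} and multiplying each games-in-the-standings constraint by 2 to clear the fraction):

```latex
\begin{aligned}
2\,(w_{1,2} - w_{2,2}) + (g_{2,2} - g_{1,2}) &\ge 1 && \text{(team 1 ahead, post-strike)}\\
2\,(w_{2,2} - w_{1,2}) + (g_{1,2} - g_{2,2}) &\ge 2 && \text{(team 2 ahead, overall)}
\end{aligned}
```

The right-hand side of the second constraint is 2 rather than 1 because the pre-strike records contribute (35 - 36) + (21 - 21) = -1 to team 2's overall advantage. Multiplying the first constraint by -1 gives 2(w_{2,2} - w_{1,2}) + (g_{1,2} - g_{2,2}) ≤ -1, which cannot hold together with the second constraint for any values of the variables.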

The instructor can further elevate this discussion and demonstrate that it is impossible under any circumstances for a team to win both the pre- and post-strike division titles but have an inferior overall (combined pre- and post-strike) win/loss record in terms of games in the standings relative to a division rival. If we define combined season total wins and losses to be wi,T = wi,1 + wi,2 and li,T = li,1 + li,2, we have

$$(w_{1,T} - w_{2,T}) + (l_{2,T} - l_{1,T}) = \left[(w_{1,1} - w_{2,1}) + (l_{2,1} - l_{1,1})\right] + \left[(w_{1,2} - w_{2,2}) + (l_{2,2} - l_{1,2})\right]$$

If i = 2 represents the team with the best overall (combined pre- and post-strike) win/loss record in its division in terms of games in the standings, the left-hand side of this expression, (w1,T − w2,T) + (l2,T − l1,T), must be negative. However, this can happen only if at least one of the two terms (w1,1 − w2,1) + (l2,1 − l1,1) or (w1,2 − w2,2) + (l2,2 − l1,2) on the right-hand side of this expression is negative, i.e., if the team represented by i = 2 finishes ahead of team 1 in either the pre- or post-strike period, which contradicts team 1 winning both titles. Here, students are given insight into the deep perspective that the simple act of modeling a problem can provide.

3.2.2 Can a team have both the best pre- and post-strike win/loss percentages in its division but still not have the best overall (combined pre- and post-strike) win/loss percentage in its division?

Given the pre-strike results and number of post-strike games remaining, the formulation of this problem is

where ε is again a sufficiently small value. If the number of post-strike games played by each team is fixed (i.e., no cancellations), this is a relatively simple linear integer programming problem. After substituting known values, the formulation for the NL West (where i=1 represents the Los Angeles Dodgers and i=2 represents the Cincinnati Reds) simplifies to

Again, the constant is left in the objective function to aid in interpretation - a positive feasible value of this objective function indicates it is possible for a team to have the best pre- and post-strike win/loss percentages in its division but still not have the best overall (combined pre- and post-strike) win/loss percentage in its division. The optimal solution of this problem is negative, indicating that it is not possible for Los Angeles to have the best pre- and post-strike win/loss percentages in its division and also have a worse overall (combined pre- and post-strike) win/loss percentage than Cincinnati. Thus, Simpson's Paradox cannot occur in the NL West Division given the pre-strike results and number of post-strike games scheduled. Instructors can use this result to motivate a discussion on how Simpson's Paradox occurs when there is a large discrepancy in the number of events observed (games played) for the various categories (teams).
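
This result is easy to verify by enumeration when no games are cancelled. The sketch below is ours, not the author's Solver model: it assumes ε = 0.0001 (the value the text adopts for the cancellation variant) and takes the objective to be Cincinnati's overall percentage minus that of Los Angeles.

```python
from fractions import Fraction

EPS = Fraction(1, 10000)  # assumed value of epsilon
G1, G2 = 53, 52           # post-strike games: Los Angeles, Cincinnati

best_value, best_wins = None, None
for w1 in range(G1 + 1):
    for w2 in range(G2 + 1):
        # Los Angeles must have the better post-strike win/loss percentage.
        if Fraction(w1, G1) - Fraction(w2, G2) < EPS:
            continue
        # Objective: Cincinnati's overall percentage minus Los Angeles's.
        z = Fraction(35 + w2, 56 + G2) - Fraction(36 + w1, 57 + G1)
        if best_value is None or z > best_value:
            best_value, best_wins = z, (w1, w2)

print(float(best_value), best_wins)
```

Under these assumptions the maximum is slightly below zero, attained when Los Angeles wins 52 of its 53 games and Cincinnati wins 51 of its 52.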

Finally, students must address the same issue under the condition that some post-strike games may be cancelled. Although this is a more general potential case of Simpson's Paradox, most students reflexively believe (incorrectly), given the previous results, they have already answered this question.

Key to this formulation is again recognizing that the games played by team i in the post-strike period (gi,2) is now a decision variable that must be integer and not exceed the number of games scheduled for team i in the post-strike period (si,2). Note that the values of the gi,2 must also be positive so the constraint forcing the team that won the pre-strike division title to have a superior post-strike win/loss percentage is defined. Thus, the student must augment the previous formulation with the constraint set

After substituting known values and setting ε = 0.0001, the formulation for the NL West (where i=1 represents the Los Angeles Dodgers and i=2 represents the Cincinnati Reds) simplifies to

The optimal solution (w1,2=1, w2,2=0, g1,2=53, g2,2=1) yields a positive objective value (0.27767), indicating that it is possible for a team to have the best pre- and post-strike win/loss percentages in its division and have an overall (combined pre- and post-strike) win/loss percentage inferior to another team in its division if some post-strike games are cancelled. The existence of Simpson's Paradox has been demonstrated. Again, instructors can note that the optimal solution occurs where the discrepancy in games played by the two teams (g1,2 and g2,2) is greatest and use these results to further a discussion on how Simpson's Paradox can occur when there is a large discrepancy in the number of events observed (games played) for the various categories (teams).
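
The reported optimum can be reproduced without a nonlinear solver by searching over the games played. This is a sketch under the stated data and ε = 0.0001, not the author's implementation; it relies on our observation that, once g1,2, g2,2, and w2,2 are fixed, the objective is largest at the smallest w1,2 that keeps Los Angeles ahead in the post-strike period.

```python
from fractions import Fraction
from math import ceil

EPS = Fraction(1, 10000)
S1, S2 = 53, 52  # post-strike games scheduled: Los Angeles, Cincinnati

best = None
for g1 in range(1, S1 + 1):          # games actually played by Los Angeles
    for g2 in range(1, S2 + 1):      # games actually played by Cincinnati
        for w2 in range(g2 + 1):
            # Smallest integer w1 with w1/g1 - w2/g2 >= EPS.
            w1 = ceil(g1 * (Fraction(w2, g2) + EPS))
            if w1 > g1:
                continue  # Los Angeles cannot finish ahead post-strike
            # Cincinnati's overall percentage minus Los Angeles's.
            z = Fraction(35 + w2, 56 + g2) - Fraction(36 + w1, 57 + g1)
            if best is None or z > best[0]:
                best = (z, w1, w2, g1, g2)

z, w1, w2, g1, g2 = best
print(round(float(z), 5), (w1, w2, g1, g2))
```

The same loop, with the inner maximum stored for each pair of games played, yields the grid of values needed to study how the result depends on cancellations.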

Upon noting that this is an extreme case (Los Angeles wins one of its fifty-three post-strike games, while Cincinnati loses its only post-strike game that isn't cancelled!), a student may arbitrarily choose what s/he considers more reasonable values for g1,2 and g2,2. Depending on the chosen values for g1,2 and g2,2, the student may or may not find other potential occurrences of Simpson's Paradox. These efforts may eventually lead a student to recognize that this problem can be linearized by iteratively fixing the number of post-strike games played by each of the teams (gi,2, i = 1,2) and solving the resulting formulation. These results can be used to generate a contour plot of the number of post-strike games played by each team against the maximum difference in overall win/loss percentage between the team that does not have the best pre- or post-strike win/loss percentage in the division and the team that does have the best pre- and post-strike win/loss percentages in the division.

Figure 1. Contour Plot of Maximum Difference in Overall win/loss percentage by games played.
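The iterative procedure described above (fix the games played, solve, repeat) can be sketched without a solver, since each subproblem is small enough to enumerate. The code below is an illustration rather than the VBA/Solver approach; the pre-strike records (36-21 and 35-21) and post-strike schedules (53 and 52 games) are assumed inputs, and the function name `max_gap` is hypothetical.

```python
from fractions import Fraction

LA_PRE = (36, 57)     # assumed pre-strike (wins, games) for Los Angeles
CIN_PRE = (35, 56)    # assumed pre-strike (wins, games) for Cincinnati
LA_SCHED, CIN_SCHED = 53, 52   # assumed post-strike games scheduled


def max_gap(g_la, g_cin):
    """For fixed post-strike games played, return (gap, w_la, w_cin) maximizing
    Cincinnati's overall pct minus Los Angeles's, with Los Angeles strictly
    ahead on post-strike pct."""
    best = None
    for w_cin in range(g_cin + 1):
        # Fewest LA wins with w_la / g_la > w_cin / g_cin; fewer LA wins
        # always widen the gap, so only this value needs checking.
        w_la = (w_cin * g_la) // g_cin + 1
        if w_la > g_la:
            continue
        gap = (Fraction(CIN_PRE[0] + w_cin, CIN_PRE[1] + g_cin)
               - Fraction(LA_PRE[0] + w_la, LA_PRE[1] + g_la))
        if best is None or gap > best[0]:
            best = (gap, w_la, w_cin)
    return best


# One subproblem per (g_la, g_cin) pair: the values a contour plot would display.
grid = {(g_la, g_cin): max_gap(g_la, g_cin)
        for g_la in range(1, LA_SCHED + 1)
        for g_cin in range(1, CIN_SCHED + 1)}

(best_g_la, best_g_cin), (best_gap, best_w_la, best_w_cin) = max(
    grid.items(), key=lambda item: item[1][0])
paradox_cells = sum(1 for gap, _, _ in grid.values() if gap > 0)

print(f"largest gap {float(best_gap):.5f} at g_la={best_g_la}, g_cin={best_g_cin}, "
      f"w_la={best_w_la}, w_cin={best_w_cin}")
print(f"{paradox_cells} of {len(grid)} grid cells admit the paradox")
```

The `grid` dictionary can be passed to any plotting tool to draw the contour. Cells with a positive gap are the combinations of games played for which the paradox is possible under these assumptions.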

This plot would be relatively easy for a student to produce using VBA in conjunction with Solver©. A student who produces this plot can respond intelligently to the last question posed by the case (If any of these problems are possible, can you design a resolution that will give all teams a competitive incentive after the strike, justly reward or penalize teams for their pre-strike performances, and avoid the pitfalls of the MLB split-season strategy?). The student could suggest that MLB management use the plot to avoid an occurrence of Simpson's Paradox if similar circumstances arose - MLB must ensure the two teams play an appropriate number of games to put them in the light blue region (center crease) of the surface on the graph. If the teams in question are in danger of leaving the light blue crease of the contour plot, MLB must reschedule some of the rained-out post-strike games (which is actually an MLB policy). This plot also provides an important insight into Simpson's Paradox - the potential for this phenomenon grows with the discrepancy in post-strike games played by two teams.

Students have also suggested several other creative strategies for avoiding the pitfalls of MLB's split-season scheme (crown divisional champions for the pre-strike and combined pre- and post-strike periods, crown divisional champions for the post-strike and combined pre- and post-strike periods) and can test each of these solutions with the same approach they use to test MLB's split-season scheme. Students generally do come to the realization that there is no absolutely fair resolution to this problem (a very important conclusion).

A student could conceivably use other formulations to demonstrate that it is possible for a team to have the best pre- and post-strike win/loss percentages in its division and have an overall (combined pre- and post-strike) win/loss percentage inferior to another team in its division if some post-strike games are cancelled. Optimization (minimization or maximization) of the number of post-strike games played (gi,2) or won (wi,2) by either team could be used as the objective. In addition to the constraints from the previous formulation, this formulation requires a constraint to force the team that did not win its division in either portion of the split season to have the best overall (combined pre- and post-strike) win/loss percentage in its division. For example, the following formulation

is feasible only if the team that did not win its division in either portion of the split season could have the best overall (combined pre- and post-strike) win/loss percentage in its division.
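One way to illustrate this feasibility-style formulation is to treat the paradox as a hard constraint and optimize something else. The sketch below picks one hypothetical objective, the total number of post-strike games played, and searches by enumeration rather than with Solver. The pre-strike records (36-21 and 35-21) and schedules (53 and 52 games) are assumed inputs, and the function names are illustrative only.

```python
from fractions import Fraction
from itertools import product

LA_PRE, CIN_PRE = (36, 57), (35, 56)   # assumed pre-strike (wins, games)
LA_SCHED, CIN_SCHED = 53, 52           # assumed post-strike games scheduled


def is_paradox(w_la, g_la, w_cin, g_cin):
    """True if LA has the better post-strike pct but the worse overall pct."""
    la_better_post = w_la * g_cin > w_cin * g_la
    cin_better_overall = (Fraction(CIN_PRE[0] + w_cin, CIN_PRE[1] + g_cin)
                          > Fraction(LA_PRE[0] + w_la, LA_PRE[1] + g_la))
    return la_better_post and cin_better_overall


def most_games_with_paradox():
    """Among integer solutions satisfying the paradox constraints, return one
    (w_la, g_la, w_cin, g_cin) with the most post-strike games played."""
    best = None
    for g_la, g_cin in product(range(1, LA_SCHED + 1), range(1, CIN_SCHED + 1)):
        if best is not None and g_la + g_cin <= best[1] + best[3]:
            continue  # cannot improve on the current best total
        for w_cin in range(g_cin + 1):
            # Fewest LA wins that keep LA ahead post-strike; if this value
            # fails the overall test, larger values fail it too.
            w_la = (w_cin * g_la) // g_cin + 1
            if w_la <= g_la and is_paradox(w_la, g_la, w_cin, g_cin):
                best = (w_la, g_la, w_cin, g_cin)
                break
    return best


solution = most_games_with_paradox()
print("(w_la, g_la, w_cin, g_cin) =", solution)
```

If the search returned `None`, the constraint set would be infeasible and the paradox could not occur. Any returned tuple is a certificate that it can, here with as few cancellations as the assumed inputs allow.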

4. Conclusions

This case effectively requires the student to use integer programming to determine the feasibility of numerous anomalous circumstances regarding the 1981 MLB split season. Proving or disproving the plausibility of these circumstances becomes progressively more challenging, and the case culminates in proof of the existence of Simpson's Paradox (an important, counter-intuitive, and frequently misunderstood statistical concept). Thus, the case can be used to encourage students to integrate operations research and statistics, thereby helping them understand the common nature of various quantitative disciplines.

Many potential formulations, all of which are relatively simple, can conceivably be derived from this case. While most students will not develop formulations that address all potential issues presented in this case, the author's experience suggests that an instructor can expect most of the issues to be addressed by at least one student in a reasonably sized (at least fifteen students) course section. This provides an excellent forum for discussion of various formulations, their ramifications, and how they can be used to address the various issues of the case.

The author was able to solve each formulation in this paper to optimality using Solver© in less than one minute on a standard Dell® Latitude 1.0 GHz laptop. The case scenario is familiar to many students, interesting, and thought-provoking. Furthermore, although the case deals with a baseball-related problem, it is relatively self-contained and requires no understanding of how baseball is played. Thus, it avoids alienating, or having to accommodate, students who are unfamiliar with or uninterested in baseball.

Response of both undergraduate Management Science majors and MBA students to this case has been overwhelmingly positive; the students are very pleased with what they have learned from this experience. Students at both levels are surprised to see operations research and statistics occur in the same case, and they come away from the case with an extremely deep understanding of Simpson's Paradox (one undergraduate Management Science major told me he was hired for his job because he was able to convince his prospective boss that Simpson's Paradox was a real phenomenon!). Students are also surprised to learn that i) constraint satisfaction is a legitimate purpose for some problems, ii) relatively small integer programming problems can yield very powerful results, and iii) systematically changing parameters and re-solving an integer programming formulation (i.e., sensitivity analysis) can yield tremendous insights. Grades on examinations (particularly the cumulative finals) improved greatly after the author implemented cases such as this into introductory Operations Research courses at both the undergraduate and MBA levels.

Of course, these two disparate groups of students don't respond in a like manner to the various aspects of this case. The undergraduate Management Science majors have much greater interest in and appreciation of the technical nuances of the case, while the MBA students appreciate the case's managerial and decision-making aspects.

The author has also received informal feedback from students after they have graduated and been employed for a few years. They consistently report that the critical thinking and communication skills they honed when working on cases such as 'Bowie Kuhn's Worst Nightmare' give them an enormous advantage in the workplace. This underscores the chief benefit of cases - they provide students an opportunity to apply operations research methods within the context of real problems with authentic, tangible, and discernible consequences.

References

Albert, J. (2002), "A Baseball Statistics Course," Journal of Statistics Education, Vol. 10, No. 2.

Albert, J. An Introduction to Sabermetrics.

Albert, J. (2003), Teaching Statistics Using Baseball, Mathematical Association of America.

Albright, C. (1988), "A Statistical Analysis of Hitting Streaks in Baseball," Journal of the American Statistical Association, Vol. 88, No. 424.

"Baseball Statistics Glossary," NetShrine.

Cochran, J. (2001), "Strat-O-Matic in the Classroom: Teaching Introductory Probability and Statistics with a Popular Baseball Board Game," Joint Statistical Meetings, Atlanta, GA.

Cochran, J. (2002), "Data Management, Exploratory Data Analysis, and Regression Analysis with 1969-2000 Major League Baseball Attendance," Journal of Statistics Education, Vol. 10, No. 2.

Cochran, J. (2000), "Career Records for All Modern Position Players Eligible for the Major League Baseball Hall of Fame," Journal of Statistics Education, Vol. 8, No. 2.

Cochran, J. (2000), "Successful Use of Cases in Introductory Undergraduate Business College Operations Research Courses," The Journal of the Operational Research Society, Vol. 51, No. 12.

Costa, G. and M. Huber, (2003), "Whaddya Mean? You get Credit for Studying Baseball?" technical report.

Gallian, J. (2001) "Statistics and Sports: A Freshman Seminar," ASA Proceedings of the Section on Statistics in Sports.

Harris, C. and K. Arth (2003), The Collective Bargaining Agreement for Fans.

InfoPlease (2001).

Lackritz, J. (1981), "The use of Sports Data in the Teaching of Statistics," ASA Proceedings of the Section on Statistical Education.

Lahman, S. (1996), "A Brief History of Baseball: Part III: Labor Battles in the Modern Era," The Baseball Archive.

Ladany, S. and R. Machol, (1977), eds. Optimal Strategies in Sports, North-Holland Publishing Company.

Lindsey, G. (1959), "Statistical Data Useful for the Operation of a Baseball Team," Operations Research, Vol. 7, No. 3.

Lindsey, G. (1961), "The Progress of a Score During a Baseball Game," The Journal of the American Statistical Association, Vol. 56, No. 295.

Lindsey, G. (1963), "An Investigation of Strategies in Baseball," Operations Research, Vol. 11, No. 4.

Lock, R. (1997), "NFL Scores and Pointspreads," Journal of Statistics Education, Vol. 5, No. 3.

McKenzie, J. (1996), "Teaching Applied Statistics Courses with a Sports Theme," ASA Proceedings of the Section on Statistics in Sports.

Nettleton, D. (1998), "Investigating Home Court Advantage," Journal of Statistics Education, Vol. 6, No. 2.

Pappas, D. (2003), "A Contentious History: Baseball's Labor Fights," Baseball Prospectus.

Pappas, D. (1988), "Thirty Years of Collective Bargaining Agreements," SABR 28.

Quinn, R. (1997), "Investigating Probability with the NBA Draft Lottery," Teaching Statistics.

Quinn, R. (1997), "Anomalous Sports Performances," Teaching Statistics.

Reichler, J. (2001) The Baseball Encyclopedia, New York: MacMillan Publishing Company.

Reiter, J. (2001), "Motivating Students' Interest in Statistics through Sports," ASA Proceedings of the Section on Statistics in Sports.

Ross, K. (2004), A Mathematician at the Ballpark, Pi Press.

Simonoff, J. (1998), "Move Over, Roger Maris: Breaking Baseball's Most Famous Record," Journal of Statistics Education, Vol. 6, No. 3.

Starr, N. (1997), "Nonrandom Risk: The 1970 Draft Lottery," Journal of Statistics Education, Vol. 5, No. 2.

Thorn, J., and P. Palmer (2001), Total Baseball, New York: HarperCollins Publishers.

Watnik, M. and R. Levine (2001), "NFL Y2K PCA," Journal of Statistics Education, Vol. 9, No. 3.

Watnik, M. (1998), "Pay for Play: Are Baseball Salaries Based on Performance?" Journal of Statistics Education, Vol. 6, No. 2.

Wiseman, F. and S. Chatterjee (1997), "Major League Baseball Player Salaries: Bringing Realism into Introductory Statistics Courses," The American Statistician, Vol. 51, No. 4.


To reference this paper, please use: 
Cochran J. (2004), "Bowie Kuhn's Worst Nightmare," INFORMS Transactions on Education, Vol. 5, No 1,  http://ite.pubs.informs.org/Vol5No1/Cochran/

 

Appendix

Bowie Kuhn's Worst Nightmare

Despite the mediation efforts of Major League Baseball (MLB) Commissioner Bowie Kuhn (whose primary duty is to maintain the integrity of the sport and industry of baseball), the Major League Baseball Players Association (MLBPA) and franchise owners were unable to finalize a new collective bargaining agreement prior to the 1981 Major League Baseball season. At this point in MLB history, several rules governed the movement of players from one team to another between seasons. A player who had played in the major leagues for fewer than six complete years was allowed to 'negotiate' a contract only with his current team − he had no freedom to look for a better offer from any other major league club. If a player had played at least six full seasons (or the equivalent if he had played in parts of several seasons at the major league level) and not been a free agent (a player free to negotiate and sign a contract to play with any existing major league baseball team) within the past five years, he was permitted to become a free agent upon expiration of his current contract. Franchises would go through a process (or draft) by which they could claim (in reverse order of their win-loss record from the previous season) nonexclusive rights to negotiate with a free agent, and no player could be drafted by (or negotiate with) more than twelve clubs in addition to his current employer (i.e., the team with which he finished the previous season). A team that signed a free agent had to compensate the club losing him with its first- or second-round pick in the next amateur draft (an annual procedure through which major league baseball teams allocated promising high school and college players among themselves).
The free agent draft and the compensation were viewed as necessary by the franchise owners to i) slow the movement of players from franchises in weak financial conditions to those in strong financial conditions (to preserve the competitiveness of each franchise) and ii) provide reasonable reimbursement to teams that had invested in the development of a player and eventually lost him as a free agent. Finally, any player who was ineligible for free agency but had at least three years of major league service did have the right to take his current team to binding arbitration if he felt their contract negotiations had reached an impasse.

This arrangement, which had been in place since 1976, represented a tremendous improvement in the players' negotiating position. Prior to the 1976 collective bargaining agreement (whose expiration led to these negotiations), a player was bound to his original team for life − he was not allowed to negotiate a contract with any other team (unless his current team released him or traded him to another team). This occurred chiefly because of the inclusion of the reserve clause - which gave a team the right to renew a player's contract for one year at its expiration - in the contract of every major league player. The owners interpreted this clause as effective in perpetuity - when a player's contract expired, his team could simply renew the contract under the reserve clause, thereby generating a new 'contract' that again included the reserve clause; when this new 'contract' expired after one year, the team could simply invoke the reserve clause again, restarting the cycle (the team technically did not even have to persuade the player to actually sign his renewed contract under these circumstances). Under this system, the only leverage held by a player was to 'hold out', or refuse to play (which, for most players, meant loss of their primary source of income). The owners historically paid players only enough to make a major league baseball career marginally more lucrative than most players' other career options.
The players were unable to get out from under this interpretation of the reserve clause until two players (Dave McNally of the Baltimore Orioles and Andy Messersmith of the Los Angeles Dodgers) played the entire 1975 season without signing contracts that had been renewed after the 1974 season under the reserve clause; after the 1975 season arbitrator Peter Seitz upheld each player's claim to be free of their existing teams on the basis of having satisfied the reserve clause, effectively ruling that the reserve clause could not be repeatedly invoked for a single contract.

After the 1976 collective bargaining agreement between the owners and MLBPA expired, the issue of how teams that lost players to free agency would be compensated was the primary matter of contention in the negotiations for a new agreement. In fact, the MLBPA threatened to strike in May of 1980. This potential disruption in the season was narrowly averted when, hours before the strike deadline, the parties announced a four-year deal which resolved all issues except the main point of contention - compensation for the loss of free agents. The owners and MLBPA created a joint committee to study the issue of compensation for free-agent signings and agreed that if the parties couldn't agree on a compensation formula by February 1981, the owners had the right to implement their own proposal. They also agreed that if the owners unilaterally implemented their proposal, the players had the right to strike. The MLBPA agreed to this arrangement in exchange for a reduction in the minimum years of major league service to become eligible for binding arbitration from three to two years.

The MLBPA and owners negotiated unsuccessfully throughout the winter of 1980-81, and the 1981 MLB season began without a resolution to the free agency compensation issue. Throughout the negotiations, the owners steadfastly held to one demand - they wanted to allow a team that had lost a free agent to select one player from the roster of the team that had signed the free agent (after permitting the team signing a free agent to name fifteen players from its roster that could not be selected). After fighting for years to obtain the right to free agency, the players considered this to be a crippling penalty that would almost entirely stifle the movement of free agents, and they were unwilling to accept this serious erosion of their progress.

As the season progressed, the players felt they were losing their leverage and that the owners (who the MLBPA felt were content to continue under the old agreement) were stalling in an attempt to finish the current season under the old agreement. Indeed, the owners had taken out a strike insurance policy with Lloyd's of London to cover their loss of operating income during a strike, thus greatly reducing their incentive to settle the dispute quickly.

The MLBPA eventually set and announced a strike date. Because MLB had never experienced a mid-season work stoppage, the owners considered the MLBPA threat to be disingenuous. After a week of feverish negotiations, Marvin Miller (the Executive Director of the MLBPA and its chief negotiator) made good on the players' threat and called a strike for June 12.

The divisional standings on June 12, 1981 (the first day of the strike) looked like this:

Table 1. Pre-strike Records and Divisional Standings

On July 31 (by which time they had essentially exhausted their strike insurance policy benefits), the owners capitulated and agreed to a compromise; the issue of compensation for the loss of free agents was settled by creating a player pool from which teams losing top free agents could pick. This player pool consisted of all players that remained after each team protected twenty-four of its players. Free agents were divided by a formula agreed upon by the owners and the MLBPA into Type A (the top 20% of players at their position), Type B (the top 21%-30% of players at their position), and all other players. A team losing a Type A free agent received a pick from the player pool plus a draft pick, a team losing a Type B free agent received two draft picks, and a team losing any other free agent received one draft pick. The collective bargaining agreement also stipulated that up to five teams could avoid supplying players to the pool by agreeing not to sign Type A free agents for three years. The owners and the MLBPA agreed to extend this agreement through 1984.

The All-Star game (which was scheduled to be played during the strike and so had been cancelled) was rescheduled for August 9, and the regular season was scheduled to resume on August 10; fifty days and seven hundred and twelve regular season games were lost, effectively eliminating approximately one-third of the 1981 regular season. Rescheduling such a substantial portion of the season was impractical, so MLB had to devise a strategy for dealing with the remainder of the regular season and the playoffs that provided incentive to all teams in the post-strike period and also recognized (rewarded or punished) teams for their pre-strike performances.

Baseball Commissioner Bowie Kuhn, several owners, and Marvin Miller concocted the following resolution: The regular season would resume as scheduled and no games scheduled during the strike would be rescheduled or made up. The team with the most pre-strike wins in each division was declared the 'first-half' division winner, while the team with the most post-strike wins in each division was declared the 'second-half' division winner. The 'first-half' and 'second-half' winners in each division then met in a three game 'division playoff' to determine the ultimate division winners. If the same team happened to win its division in both halves of the season, then its playoff opponent would be the team that finished second in the 'second half' of the season. The number of games remaining on each team's schedule at the conclusion of the strike was:

Table 2. Post-strike Games Remaining

As Commissioner Kuhn and his staff reviewed this solution and these circumstances as the 1981 baseball season resumed after the strike, they had reason to worry about how fans would accept the split-season solution. MLB's credibility had already been seriously damaged by the strike - if fans questioned the fairness and legitimacy of the split-season solution, the loss of perceived integrity that MLB would suffer immediately could be ruinous. Specifically, Commissioner Kuhn and his staff had to be concerned over two questions: Is it possible for a team to have the best combined split-season win/loss record in its division but still not qualify for the playoffs? Is it possible for a team to win its division in both portions of the split-season but still not have the best combined split-season win/loss record in its division? Clearly, occurrence of either of these events would give fans reason to question the legitimacy of competition.

Commissioner Kuhn had to be concerned about both of these questions with respect to baseball's traditional criterion for comparing the win/loss records of two baseball teams A and B - the number of games Team B is behind Team A in the standings, defined as

\[ \text{Games behind} = \frac{(W_A - W_B) + (L_B - L_A)}{2}, \]

where W and L denote a team's wins and losses.

By this formula, Cincinnati (35 wins, 21 losses) finished ½ game behind Los Angeles (36 wins, 21 losses) in the National League West Division during the pre-strike period:

\[ \frac{(36 - 35) + (21 - 21)}{2} = \frac{1}{2}, \]

as shown in Table 1.

Although this criterion would still be used to determine the pre-strike and post-strike divisional winners, Commissioner Kuhn recognized that many fans prefer to focus on the teams' win/loss percentages, defined for some Team A as:

\[ \text{Win/loss percentage}_A = \frac{W_A}{W_A + L_A}. \]

During a normal (uninterrupted) season, these two criteria are equivalent because all teams play the same number of games; however, the split-season resulted in an unequal number of pre-strike, post-strike, and total games to be played by teams. Thus, Commissioner Kuhn also had to be concerned about both of these questions with respect to teams' win/loss percentages. Consequently, your job as an Operations Research Analyst for MLB is to answer Commissioner Kuhn's two questions (Is it possible for a team to have the best combined split-season win/loss record in its division but still not qualify for the playoffs? Is it possible for a team to win its division in both portions of the split-season but still not have the best combined split-season win/loss record in its division?) with respect to both criteria for comparing win/loss records of two baseball teams A and B (number of games Team B is behind Team A in the standings and win/loss percentage). If any of these problems are possible, Commissioner Kuhn wants to know if you can design a resolution that will give all teams a competitive incentive after the strike, justly reward or penalize teams for their pre-strike performances, and avoid the pitfalls of the MLB split-season strategy.
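The difference between the two criteria can be illustrated with a few lines of code. In the sketch below the function names are hypothetical, the Los Angeles and Cincinnati records (36-21 and 35-21) are assumed pre-strike values, and the Team A / Team B records are invented purely to show that the criteria can disagree when teams have played different numbers of games.

```python
from fractions import Fraction


def games_behind(leader, trailer):
    """Games that `trailer` is behind `leader`; each record is (wins, losses)."""
    (w_a, l_a), (w_b, l_b) = leader, trailer
    return Fraction((w_a - w_b) + (l_b - l_a), 2)


def win_pct(record):
    """Win/loss percentage: wins divided by games played."""
    wins, losses = record
    return Fraction(wins, wins + losses)


# Assumed pre-strike NL West records as (wins, losses).
los_angeles, cincinnati = (36, 21), (35, 21)
print("Cincinnati games behind Los Angeles:", games_behind(los_angeles, cincinnati))

# Invented records with unequal games played (50 versus 41).
team_a, team_b = (30, 20), (25, 16)
print("Team B games behind Team A:", games_behind(team_a, team_b))
print("Team A pct:", float(win_pct(team_a)), "Team B pct:", float(win_pct(team_b)))
```

In the invented example Team B trails Team A in games behind yet has the higher win/loss percentage, which is why both criteria have to be examined separately for the split season.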