Abstract
Students are presented with a narrative concerning the contentious negotiations
between Major League Baseball (MLB) franchise owners and the Major League Baseball
Players Association (MLBPA) over a new collective bargaining agreement in the early
1980s. Details of the 1981 players' strike, ensuing collective bargaining agreement,
and strategy for resuming the interrupted season and facilitating the playoffs are
provided. Pre-strike win/loss records and number of post-strike scheduled games are
provided for each team. This case gives students great insight into the formulation
of integer programming problems and interpretation of the resulting optimal solutions;
in this case they must formulate constraint satisfaction problems with integer decision
variables to assess the advisability of the MLB plan for resuming the season and
revising the playoff structure to reflect the games lost to the strike (and the potential
for an occurrence of Simpson's Paradox). As such, this case is also unique in its
use of optimization to demonstrate a statistical phenomenon. Furthermore, although
the case deals with a baseball-related problem, it is relatively self-contained and
requires no understanding of how baseball is played, so students who are unfamiliar
with the sport will not be seriously disadvantaged.
Teaching applied quantitative methods (Statistics and Operations Research) can be
daunting - many students are poorly prepared, have little prior knowledge or understanding
of quantitative methods, and are anxious about the required mathematics. Furthermore,
although applied quantitative methods are usually taught within the context of some
area(s) of application, such courses are often taught before the students have a
firm grounding in the chosen contextual area. By working within a context that is
more familiar and interesting to students, an instructor of applied quantitative
methods can increase her/his students' interest in and tolerance for the frustrating
and complex concepts that comprise much of the course. The result is better understanding
and retention by students of course material.
In addition to serving as a copious source of easily obtained and understood data,
sports provide a context that is familiar and interesting to many students. Several
instructors have reported great success using sports examples to motivate and illustrate
various quantitative methods. Lackritz (1981) provides
an early discussion of the use of sports data in the teaching of statistics. Lock
(1997) uses data on point
spreads, odds, and outcomes for five years of NFL regular season and playoff games
to address a variety of statistics issues. Simonoff (1998)
uses the 1998 baseball season and the dramatic competition between Mark McGwire and
Sammy Sosa to break Roger Maris' single-season home run record to illustrate various
types of exploratory analyses. Albright (1988) uses sequences
of at-bats for MLB players to illustrate the types of exploratory and confirmatory
analyses that can be performed to look for streakiness. Nettleton (1998)
uses college basketball game scores and analyzes home-court advantage to illustrate
introductory statistics concepts. MLB Hall-of-Fame election results and measurements
of individual player performances are used by Cochran (2000)
to teach descriptive statistics as well as methods for classification and discrimination.
The strength of National Football League (NFL) teams is assessed using principal
components methodology in a course taught by Watnik and Levine (2001).
Wiseman and Chatterjee (1997), Watnik (1998), and Cochran (2002) use sports economics
examples to motivate students; Wiseman, Chatterjee, and Watnik use MLB player salaries
in their introductory statistics courses, while Cochran uses annual MLB franchise
attendance and team performance measures for project courses in linear models and
econometrics.
Many authors have documented their use of sports examples to motivate the study
of probability. Quinn (1997) and Starr (1997) each use the NBA draft lottery to illustrate
various concepts in probability, while Quinn (1997) also uses anomalous sports performances
to demonstrate probability concepts.
Several instructors teach courses (predominantly in probability and statistics)
primarily or exclusively from a sports context. Cochran (2001) uses sports simulation
board games to introduce a number of concepts in the probability segment of an introductory
statistics course. Reiter (2001) teaches a special three-week course on statistics
and sports. Gallian (2001) utilizes statistical concepts to analyze sports achievements
and strategies in a freshman mathematics and sports seminar offered to liberal arts
students. Albert (2002) describes a special section of an introductory statistics
course where all course material is taught from a baseball perspective. McKenzie
(1996) discusses his efforts to teach an applied statistics course with a sports
theme. Costa and Huber (2003) describe a one credit hour course that focuses on
sabermetrics (defined by Albert in An Introduction to Sabermetrics as "mathematical
and statistical analysis of baseball records"). Finally, Albert (2003) has published
a sports-oriented textbook to be used in introductory statistics courses, and Ross
(2004) has published a textbook to be used in a freshman seminar similar to the course
taught by Gallian.
Although these examples are all taken from probability and statistics, it is worth
noting that several papers and books have been published on using operations research
to analyze sports problems, including the series of seminal articles by Lindsey (1959,
1961, 1963). Ladany and Machol (1977) also edited an important early volume of papers
on optimization strategies in sports. Use of such examples in the operations research
classroom has not been as well-documented, but this special issue of INFORMS Transactions
on Education serves as evidence that they do exist.
While sports examples are not a panacea for applied quantitative methods courses,
these authors have found that most students (even those who are not knowledgeable
about sports) enjoy and appreciate sports examples in quantitative methods courses if
the context is carefully explained. Furthermore, anecdotal evidence suggests
that a student who develops an understanding of a quantitative method through a sports
example is relatively well-equipped to apply the technique in other familiar contexts
later in her/his education and career.
The contentious negotiations between MLB franchise owners and the MLBPA over a new
collective bargaining agreement in the early 1980s provide the backdrop for the case
discussed in this paper. These circumstances led to MLB's first in-season work stoppage,
a strike that lasted for fifty days and resulted in the loss of seven hundred and
twelve regular season games. To deal with the loss of approximately one-third of
the 1981 regular season, MLB Commissioner Bowie Kuhn, MLBPA Executive Director Marvin
Miller, and several owners devised the following strategy: The regular season resumed
as scheduled and no games lost to the strike were rescheduled. The team with the
most pre-strike wins in each division was declared the 'first-half' division winner,
while the team with the most post-strike wins in each division was declared the 'second-half'
division winner. The first-half and second-half winners in each division were then
to meet in a three game 'division playoff' to determine the ultimate division champion.
If the same team won a division in both the first and the second halves of the season,
its playoff opponent would be the division's post-strike second place team. This
solution, obviously devised to simultaneously recognize teams for their pre-strike
performances and provide incentive for all teams to be competitive during the post-strike
period, quickly came to be known as the 'split season.' A student analyzing this
case must address the potential pitfalls of the split season solution developed by
Mr. Kuhn, Mr. Miller, and the owners. Specifically, the student must address two
fundamental questions: Given the pre-strike results and number of post-strike games
remaining, is it possible for a team to:
i) win neither the pre- nor post-strike division title but have the best overall
(combined pre- and post-strike) win/loss record in its division?
or
ii) win both the pre- and post-strike division titles but have an inferior overall
(combined pre- and post-strike) win/loss record relative to a division rival?
This case has been used successfully in an undergraduate introductory mathematical
programming course offered for business students majoring in management science.
This course is the students' first exposure to mathematical programming and provides
an overview of mathematical programming (linear, integer, goal, and nonlinear programming;
network flow problems; combinatorial optimization) with a strong emphasis on modeling
and formulation. The case has also been used with similar success in an MBA core
introductory operations research course. This case-based course provides a less extensive
overview of mathematical programming (linear, goal, and integer programming, network
flow problems) and some coverage of basic stochastic models (decision trees, queuing,
Markov chains, simulation) with a strong emphasis on modeling and formulation. In
both courses, students are encouraged to discuss the case in groups of 2-3 but are
required to submit individual two-page analyses. Students in both courses have spent
several weeks learning about mathematical programming and using Solver© on
such problems prior to assignment of the case.
As prescribed by Cochran (2000), the case is due in both of these classes immediately
prior to an examination on mathematical programming and is the basis of a class discussion
that is intended to help students prepare for the ensuing examination. Because these
courses and the examinations emphasize modeling and formulation, this case provides
strong support for the course objectives. Although the case presents a challenging
formulation for students at either of these levels, it is not beyond their capability
(particularly when students are encouraged to discuss the issues in small teams as
they work through the case). Encouraging students to discuss the case in small teams
also mitigates the potential difficulty presented by students who have little background
or interest in baseball; the case assignment takes on the flavor of a typical business
assignment in which a team of employees with widely different backgrounds are expected
to work cooperatively to solve the problem. Presenting the assignment to the students
in these terms and devoting a class meeting to discussion of the case prior to the
exam effectively defuses anxiety among non-sports minded students with regard to
the case. The author's experience suggests that many of these students actually enjoy
the opportunity to learn a little about the baseball industry and/or see the sport
from a different perspective. Finally, although the case deals with a baseball-related
problem, it requires no understanding of how baseball is played; terms that are idiosyncratic
to baseball or sports are carefully defined or explained in the introductory paragraphs,
so the case is relatively self-contained - explicitly pointing this out also serves
to allay non-sports minded students' anxieties about this case.
Faculty can also direct students to several websites that are useful for the analysis
of this case. A few potentially useful sites include:
Of course, this is not an exhaustive list. Thousands of web sites are devoted to
various aspects of baseball and could potentially be of interest; this wealth of
readily available information is one of the several interesting aspects of this case.
This case raises two principal issues: Given the pre-strike results and number of
post-strike games remaining, is it possible for a team to:
i) win neither the pre- nor post-strike division title but have the best overall
(combined pre- and post-strike) win/loss record in its division?
or
ii) win both the pre- and post-strike division titles but have an inferior overall
(combined pre- and post-strike) win/loss record relative to a division rival?
The ramifications are clear - if either of these events were to occur, fans would
question the fairness and legitimacy of the split-season solution. If the split-season
solution gave fans reason to question the legitimacy of competition, the loss of
perceived integrity that MLB would suffer immediately after its exceedingly acrimonious
first in-season work stoppage would be devastating for the baseball industry. In
analyzing this case, the student must understand how the win/loss records of baseball
teams are evaluated, i.e., on what criteria are teams ranked. Most students who are
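unfamiliar with baseball will immediately focus on teams' win/loss percentages,
defined for some Team A as

$$\text{win/loss percentage}_A = \frac{w_A}{w_A + l_A},$$

where $w_A$ and $l_A$ are the numbers of games won and lost by Team A.
However, the common criterion for comparing the win/loss records of two baseball teams
A and B is the number of games Team B is behind Team A in the standings. The number
of games Team B is behind Team A in the standings (from this point forward referred
to as games in the standings) is defined as

$$\text{games behind}_{B,A} = \frac{(w_A - w_B) + (l_B - l_A)}{2}.$$
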
These definitions are provided in the case; however, they could be removed and students
could be expected to find them on their own (the author has done so when assigning
this case to encourage students to develop some skill in performing background research;
students can find this definition on several websites including NetShrine's "Baseball
Statistics Glossary").
Students must be made to understand that these criteria will not always yield equivalent
standings (ranking of teams) unless all teams play an equal number of games (and
that most fans don't understand that this could happen!). Because all franchises
are scheduled to play an equal number of regular season games (162), it is obviously
impossible for a team with a lower winning percentage than another team to be superior
in terms of games in the standings at the conclusion of the season if both
teams complete their schedules. However, the 1981 work stoppage and resulting split
schedule left teams with unequal numbers of games in both the pre- and post-strike
periods (as well as over the entire season). Many students will have difficulty accepting
that it is possible for these two criteria to result in different standings under
any circumstances; the instructor can provide a simple example such as the following
to convince these skeptical students:
Consider two teams and their win/loss records: Team A with seven wins and three losses,
and Team B with four wins and one loss. Team A is ½ game ahead of Team B, since
$[(7 - 4) + (1 - 3)]/2 = 0.5$, but Team B (0.800) has a better win/loss percentage than
Team A (0.700).
This relatively trivial example (in conjunction with the students' initial skepticism)
demonstrates why the potential for such an occurrence (and the public relations debacle
that would certainly ensue) should have been a concern of MLB during the 1981 season
and underscores why it is important to consider both win/loss percentage and games
in the standings when addressing the issues in this case.
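The two criteria can be checked side by side with a few lines of code. The following is a minimal sketch in Python (the function names `win_pct` and `games_ahead` are illustrative and do not come from the case); it reproduces the Team A and Team B example above.

```python
def win_pct(wins, losses):
    """Win/loss percentage: wins divided by games played."""
    return wins / (wins + losses)


def games_ahead(wins_a, losses_a, wins_b, losses_b):
    """Games Team A is ahead of Team B (equivalently, games B is behind A)."""
    return ((wins_a - wins_b) + (losses_b - losses_a)) / 2


team_a = (7, 3)  # Team A: seven wins, three losses
team_b = (4, 1)  # Team B: four wins, one loss

lead_a = games_ahead(*team_a, *team_b)  # Team A's lead in the standings
pct_a = win_pct(*team_a)
pct_b = win_pct(*team_b)
print(lead_a, pct_a, pct_b)
```

Team A leads in games in the standings while Team B leads in win/loss percentage, so the two rankings disagree whenever the teams have played unequal numbers of games in this way.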
Using both win/loss percentage and games in the standings,
we now consider the two questions
i) Can a team win neither the pre- nor post-strike division title but have the best
overall (combined pre- and post-strike) win/loss record in its division?
and
ii) Can a team win both the pre- and post-strike division titles but have an inferior
overall (combined pre- and post-strike) win/loss record relative to a division rival?
In modeling these questions, students generally define decision variables similar
to the following:
$w_{i,j}$ is the number of games won by team $i$ in period $j$, $j = 1$ (pre-strike),
2 (post-strike)
Given these decision variable definitions, other values can easily be defined:
$g_{i,j}$ is the number of games played by team $i$ in period $j$
$l_{i,j} = g_{i,j} - w_{i,j}$ is the number of losses
suffered by team $i$ in period $j$
If games in the standings is used as the criterion, a pair of constraints
must be constructed to force two different teams (say i=1 and i=2) to finish ahead
of a third team (i=3) in the pre- and post-strike periods, respectively. These constraints
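could look like

$$\frac{(w_{1,1} - w_{3,1}) + (l_{3,1} - l_{1,1})}{2} \ge 0.50 \quad \text{and} \quad \frac{(w_{2,2} - w_{3,2}) + (l_{3,2} - l_{2,2})}{2} \ge 0.50.$$
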
Similarly, a pair of constraints must be constructed to force teams 1 and 2 (who
had superior pre- and post-strike records, respectively) to each have an inferior
overall (combined pre- and post-strike) win/loss record in terms of games in the
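standings relative to team 3. These constraints could look like

$$\frac{(w_{3,1} + w_{3,2} - w_{1,1} - w_{1,2}) + (l_{1,1} + l_{1,2} - l_{3,1} - l_{3,2})}{2} \ge 0.50 \quad \text{and} \quad \frac{(w_{3,1} + w_{3,2} - w_{2,1} - w_{2,2}) + (l_{2,1} + l_{2,2} - l_{3,1} - l_{3,2})}{2} \ge 0.50.$$
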
The right-hand side for each of these constraints is set to 0.50 because that is
the smallest possible advantage in games in the standings that one team can
have over another team (determining this is another interesting challenge for students - many
have difficulty with this aspect of the case).
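Also note that the number of wins for team i in period j must be a nonnegative integer
that does not exceed the number of games played by team i in period j:

$$0 \le w_{i,j} \le g_{i,j}, \quad w_{i,j} \text{ integer}, \quad i = 1, 2, 3; \; j = 1, 2.$$

The resulting feasible region is given by

$$\begin{aligned}
\tfrac{1}{2}\left[(w_{1,1} - w_{3,1}) + (l_{3,1} - l_{1,1})\right] &\ge 0.50 \\
\tfrac{1}{2}\left[(w_{2,2} - w_{3,2}) + (l_{3,2} - l_{2,2})\right] &\ge 0.50 \\
\tfrac{1}{2}\left[(w_{3,1} + w_{3,2} - w_{1,1} - w_{1,2}) + (l_{1,1} + l_{1,2} - l_{3,1} - l_{3,2})\right] &\ge 0.50 \\
\tfrac{1}{2}\left[(w_{3,1} + w_{3,2} - w_{2,1} - w_{2,2}) + (l_{2,1} + l_{2,2} - l_{3,1} - l_{3,2})\right] &\ge 0.50 \\
0 \le w_{i,j} \le g_{i,j}, \quad w_{i,j} \text{ integer}, &\quad i = 1, 2, 3; \; j = 1, 2.
\end{aligned}$$
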
Students generally choose one division on which to initially focus their efforts
- most choose to concentrate on the National League (NL) West Division because it had the tightest
pre-strike standings (the Cincinnati Reds finished 0.5 games behind the Los Angeles
Dodgers). Furthermore, it is logical to select the team in the chosen division with
the worst pre-strike record as the team that will (hypothetically) finish with a
better post-strike record than Cincinnati. In this instance, that team is the San
Diego Padres (with twenty-three wins and thirty-three losses). Thus, i=1 represents
the Los Angeles Dodgers, i=2 represents the San Diego Padres, and i=3 represents
the Cincinnati Reds in the National League West Division model.
Once a student selects a division on which to initially focus, the feasible region
can be simplified by recognizing that the constraint forcing team 1 to finish with
a better pre-strike record than team 3 can be eliminated (because the pre-strike
standings are known). The feasible region can be further simplified through substitution
of known values for i) number of pre-strike wins by each team ($w_{1,1} = 36$, $w_{2,1} = 23$, $w_{3,1} = 35$),
ii) number of pre-strike games played by each team ($g_{1,1} = 57$, $g_{2,1} = 56$, $g_{3,1} = 56$),
and iii) number of post-strike games played by each team ($g_{1,2} = 53$, $g_{2,2} = 54$, $g_{3,2} = 52$).
The simplified feasible region is

At this point, the student must recognize that integer programming is actually being
used in this case to determine feasibility and not optimality. Understanding
this point is crucial to the development of an objective function for this formulation;
it provides great latitude in the choice of objective function. One could, for example,
choose to optimize (either maximize or minimize) the number of post-strike games played
or won by any of the three teams.
This formulation is feasible, which implies that Cincinnati (or some other MLB team)
could win neither the pre- nor post-strike division title and still have the best overall
(combined pre- and post-strike) win/loss record in terms of games in the standings in
its division.
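Because the question is one of feasibility, it can also be checked by simple enumeration rather than with an integer programming solver. The following is a minimal sketch in Python under the three-team setup described above (1 = Los Angeles, 2 = San Diego, 3 = Cincinnati); the function names are illustrative, and the 0.5-game thresholds are the ones discussed in the text. The pre-strike requirement is omitted because Los Angeles is already known to have finished ahead of Cincinnati.

```python
from itertools import product

# Pre-strike wins/games and post-strike games quoted in the text.
# Index 1 = Los Angeles, 2 = San Diego, 3 = Cincinnati.
PRE_WINS = {1: 36, 2: 23, 3: 35}
PRE_GAMES = {1: 57, 2: 56, 3: 56}
POST_GAMES = {1: 53, 2: 54, 3: 52}


def lead(w_a, l_a, w_b, l_b):
    """Games in the standings by which team a is ahead of team b."""
    return ((w_a - w_b) + (l_b - l_a)) / 2


def totals(i, post_wins):
    """Combined (wins, losses) for team i, assuming all scheduled games are played."""
    wins = PRE_WINS[i] + post_wins
    losses = PRE_GAMES[i] + POST_GAMES[i] - wins
    return wins, losses


def is_feasible(w12, w22, w32):
    """True if team 2 beats team 3 post-strike while team 3 leads both rivals overall."""
    post2 = (w22, POST_GAMES[2] - w22)
    post3 = (w32, POST_GAMES[3] - w32)
    if lead(*post2, *post3) < 0.5:
        return False
    t1, t2, t3 = totals(1, w12), totals(2, w22), totals(3, w32)
    return lead(*t3, *t1) >= 0.5 and lead(*t3, *t2) >= 0.5


def first_feasible():
    """Return the first feasible (w12, w22, w32) in lexicographic order, or None."""
    ranges = [range(POST_GAMES[i] + 1) for i in (1, 2, 3)]
    for w12, w22, w32 in product(*ranges):
        if is_feasible(w12, w22, w32):
            return w12, w22, w32
    return None


solution = first_feasible()
print(solution)
```

Any returned triple is a witness that the split-season plan admits the first anomaly; the first solution found is an extreme one, and less extreme win totals can be tested directly with `is_feasible`.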
A similar formulation can be used if win/loss percentage is used as
the criterion. Given a sufficiently small value ε, the associated feasible
region (after simplification) is given by
Again, students have great latitude in the choice of objective function. One could
again choose any of the objective functions considered for the previous formulation
(as well as several others). Furthermore, the question of the appropriate size of ε and
the effect of the size of ε on the solution time is interesting in its own
right - this creates another opportunity for a provocative class discussion.
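One way to start the discussion of ε is to ask how close two distinct win/loss percentages can be. The following is a minimal sketch in Python (the helper name `smallest_positive_gap` is illustrative); it assumes the two teams play fixed numbers of games and uses exact rational arithmetic. Any ε no larger than the smallest positive gap separates "strictly better" from "tied" without excluding a legitimate solution.

```python
from fractions import Fraction


def smallest_positive_gap(games_a, games_b):
    """Smallest nonzero |x/games_a - y/games_b| over all possible win counts x and y."""
    gaps = {
        abs(Fraction(x, games_a) - Fraction(y, games_b))
        for x in range(games_a + 1)
        for y in range(games_b + 1)
    }
    gaps.discard(Fraction(0))  # ties are not a strict advantage
    return min(gaps)


# Post-strike games scheduled for Los Angeles (53) and Cincinnati (52).
post_gap = smallest_positive_gap(53, 52)
print(post_gap, float(post_gap))
```

For these two schedules the smallest positive gap is larger than 0.0001, so that value of ε is safe for the post-strike comparison; the same check can be repeated for other pairs of game counts.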
While this question certainly can (and will) be modeled by many students, it is
relatively simple to answer without use of any modeling - a student can logically
determine that this could easily occur (in terms of either win/loss percentage or games
in the standings) if one team finished a close second to a different team in each
segment of the split schedule. As a student who does independent reading on the 1981
season will find, this actually happened in both NL divisions:
Table 1. Pre-strike, Post-strike, and Combined win/loss Records 
In terms of games in the standings, the Philadelphia Phillies won the pre-strike
NL East Division title (by 1.5 games over the St. Louis Cardinals) and the Montreal
Expos won the post-strike NL East Division title (by 0.5 games over St. Louis), while
St. Louis had a superior overall record (by 2.0 games over the Montreal Expos). Similarly,
in the NL West Division the Los Angeles Dodgers won the pre-strike division title
(by 0.5 games over the Cincinnati Reds) and the Houston Astros won the post-strike
division title (by 1.5 games over Cincinnati), while Cincinnati had a superior overall
record (by 4.0 games over Los Angeles). Despite having the best record of any MLB
team in 1981, Cincinnati did not qualify for the playoffs! The results were similar
with regard to win/loss percentage.
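The margins quoted above can be recomputed from the half-season records. The following is a minimal sketch in Python; the win/loss figures are taken from the published 1981 standings and are an assumption here, since Table 1 is not reproduced in this text, and the names `HALVES`, `lead`, and `combined` are illustrative.

```python
# (wins, losses) by half, as recalled from published 1981 standings (assumption).
HALVES = {
    "Philadelphia": {"pre": (34, 21), "post": (25, 27)},
    "St. Louis": {"pre": (30, 20), "post": (29, 23)},
    "Montreal": {"pre": (30, 25), "post": (30, 23)},
    "Los Angeles": {"pre": (36, 21), "post": (27, 26)},
    "Cincinnati": {"pre": (35, 21), "post": (31, 21)},
    "Houston": {"pre": (28, 29), "post": (33, 20)},
}


def lead(a, b):
    """Games that record a = (wins, losses) is ahead of record b."""
    return ((a[0] - b[0]) + (b[1] - a[1])) / 2


def combined(team):
    """Full-season (wins, losses) for a team."""
    pre, post = HALVES[team]["pre"], HALVES[team]["post"]
    return (pre[0] + post[0], pre[1] + post[1])


margins = {
    "NL East pre": lead(HALVES["Philadelphia"]["pre"], HALVES["St. Louis"]["pre"]),
    "NL East post": lead(HALVES["Montreal"]["post"], HALVES["St. Louis"]["post"]),
    "NL East overall": lead(combined("St. Louis"), combined("Montreal")),
    "NL West pre": lead(HALVES["Los Angeles"]["pre"], HALVES["Cincinnati"]["pre"]),
    "NL West post": lead(HALVES["Houston"]["post"], HALVES["Cincinnati"]["post"]),
    "NL West overall": lead(combined("Cincinnati"), combined("Los Angeles")),
}
for label, games in margins.items():
    print(label, games)
```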
This question is far more challenging - it is more difficult to answer without modeling
and no examples of this result occurred during the 1981 season. Additionally, students
must recognize that they can use the same variables and values $w_{i,j}$ and $g_{i,j}$ defined
in Section 3.1, but they only need to consider two teams (a team that wins both the
pre- and post-strike division titles and a team that doesn't win either title) to
model this question.
3.2.1 Can a team win both the pre- and post-strike division titles but have an
inferior combined split season win/loss record in terms of games in the standings relative
to a division rival?
When using games in the standings as their criterion, students generally
let i=1 for the team that wins both the pre- and post-strike division titles and i=2
for the team that doesn't win either title but has the superior combined split season
win/loss record, and then develop an objective function of some form
similar to:

$$\max \; \frac{\left[(w_{2,1} + w_{2,2}) - (w_{1,1} + w_{1,2})\right] + \left[(l_{1,1} + l_{1,2}) - (l_{2,1} + l_{2,2})\right]}{2}$$

This objective function allows for the determination of the existence of a combined
split-season advantage in games in the standings of a team who did not win
the pre-strike division title (i=2) over the team who did win the pre-strike division
title (i=1). Note that the objective function can be simplified for this problem
because the values of $w_{i,1}$ and $g_{i,j}$ are each
known for all $i$ and $j$.
In order to force a post-strike advantage in games in the standings for the
team that won the pre-strike division title (i=1) over a team that did not win the
pre-strike division title (i=2), the following constraint is constructed:

$$\frac{(w_{1,2} - w_{2,2}) + (l_{2,2} - l_{1,2})}{2} \ge 0.50$$

The right-hand side of this constraint is again set to 0.50 because that is the
smallest possible advantage in games in the standings that one team can have
over another team.
Finally, the number of wins for team i in the post-strike period must be a nonnegative
integer that does not exceed the number of games played by team i during the post-strike
period. The resulting formulation is
Suppose we formulate this problem for the NL West division (the division with the
closest pre-strike standings) and let i=1 represent the Los Angeles Dodgers (who
won the pre-strike division title) and i=2 represent the Cincinnati Reds (who finished
second in the division in the pre-strike standings). The formulation that results
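after substituting known values for i) number of pre-strike wins by each team ($w_{1,1} = 36$, $w_{2,1} = 35$),
ii) number of pre-strike games played by each team ($g_{1,1} = 57$, $g_{2,1} = 56$),
and iii) number of post-strike games played by each team ($g_{1,2} = 53$, $g_{2,2} = 52$)
simplifies to

$$\begin{aligned}
\max \quad & w_{2,2} - w_{1,2} \\
\text{s.t.} \quad & w_{1,2} - w_{2,2} \ge 1 \\
& 0 \le w_{1,2} \le 53, \quad 0 \le w_{2,2} \le 52, \quad w_{1,2}, w_{2,2} \text{ integer.}
\end{aligned}$$
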
In this case, the constants in the objective function (values of $w_{i,1}$ and $g_{i,j}$)
completely cancel each other. If a constant remained in this objective function after
simplification (possibly when formulating this problem for another division), the
student could choose to retain the constant because its presence enables a quick determination
of whether it is possible for a team to win its division in both portions of the
split season but still not have the best overall (combined pre- and post-strike)
win/loss record in its division in terms of games in the standings (a positive
feasible value for this objective function indicates that such an occurrence is possible).
This problem, which can easily be solved by inspection, yields several alternate
optima of the form

$$w_{2,2} = w_{1,2} - 1,$$

each with an associated objective value of -1.0 (note that $w_{1,2} \in \{1, 2, \ldots, 53\}$
so that $w_{2,2} \ge 0$). This result indicates that, given
the pre-strike results and number of post-strike games remaining, it is not possible
for Los Angeles to win its division in both portions of the split season and have
a worse overall (combined pre- and post-strike) win/loss record in terms of games
in the standings relative to Cincinnati.
The optimal objective values for the formulations associated with each of the other
divisions (NL East, American League East and West) are also negative. These results
demonstrate that, given the pre-strike results and number of post-strike games remaining,
it is not possible for any MLB team to win both the pre- and post-strike division
titles but have an inferior overall (combined pre- and post-strike) win/loss record
in terms of games in the standings relative to a division rival.
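The NL West result can be confirmed by enumerating every pair of post-strike win totals. The following is a minimal sketch in Python, not the Solver model used in the courses; it assumes all scheduled post-strike games are played, uses the games in the standings definition given earlier, and the function names are illustrative.

```python
# NL West values quoted in the text: 1 = Los Angeles, 2 = Cincinnati.
PRE_WINS = {1: 36, 2: 35}
PRE_GAMES = {1: 57, 2: 56}
POST_GAMES = {1: 53, 2: 52}


def lead(w_a, l_a, w_b, l_b):
    """Games in the standings by which team a is ahead of team b."""
    return ((w_a - w_b) + (l_b - l_a)) / 2


def overall_lead_of_team2(w12, w22):
    """Combined-season lead of team 2 over team 1 (the objective function)."""
    w1, w2 = PRE_WINS[1] + w12, PRE_WINS[2] + w22
    l1 = PRE_GAMES[1] + POST_GAMES[1] - w1
    l2 = PRE_GAMES[2] + POST_GAMES[2] - w2
    return lead(w2, l2, w1, l1)


def solve():
    """Maximize team 2's overall lead subject to team 1 winning the post-strike half."""
    best, optima = None, []
    for w12 in range(POST_GAMES[1] + 1):
        for w22 in range(POST_GAMES[2] + 1):
            post_lead = lead(w12, POST_GAMES[1] - w12, w22, POST_GAMES[2] - w22)
            if post_lead < 0.5:
                continue  # team 1 must be at least half a game ahead post-strike
            value = overall_lead_of_team2(w12, w22)
            if best is None or value > best:
                best, optima = value, [(w12, w22)]
            elif value == best:
                optima.append((w12, w22))
    return best, optima


best_value, optima = solve()
print(best_value, len(optima))
```

A negative `best_value` means that no assignment of post-strike wins lets Cincinnati finish ahead of Los Angeles overall while Los Angeles wins both halves.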
There is, however, another issue lurking in this case - due to inclement weather,
some scheduled games may be cancelled (rainouts). Thus, an implicit assumption of
the previous model regarding the number of post-strike games remaining may be false.
If this assumption is relaxed, the $g_{i,2}$ may be considered decision
variables, and the salient question becomes: Given the pre-strike results, is it
possible under any circumstances (i.e., any values of $g_{i,2}$) for an MLB team
to win both the pre- and post-strike division titles but have an inferior overall
(combined pre- and post-strike) win/loss record in terms of games in the standings relative
to a division rival? In addition to constraining the post-strike advantage
of the team that won the pre-strike division title (i=1) over a team that did
not win the pre-strike division title (i=2) to be positive, we must now add the following
constraint

$$\frac{\left[(w_{2,1} + w_{2,2}) - (w_{1,1} + w_{1,2})\right] + \left[(l_{1,1} + l_{1,2}) - (l_{2,1} + l_{2,2})\right]}{2} \ge 0.50$$

to force the team that did not win either the pre-strike or post-strike division
title (i=2) to have a positive overall advantage in games in the standings over
the team that won both the pre-strike and post-strike division titles (i=1). We
also must now limit the number of games played by each team in the post-strike period
($g_{i,2}$) so they are nonnegative integers that do not exceed the
number of post-strike games scheduled for team i. If we define $s_{i,2}$ to
be the number of post-strike games scheduled for team i, the resulting formulation
is
The simplified formulation of this problem for the NL West (with $s_{1,2} = 53$, $s_{2,2} = 52$)
is
If no student notices, the instructor should point out that the first two constraints
combine to render this formulation infeasible - when the first constraint is multiplied
by -1.0, the left hand side (which is now identical to the left hand side of the
second constraint) is constrained to values no greater than -1.00. Thus, this constraint
and the second constraint cannot be satisfied simultaneously.
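The infeasibility can also be demonstrated by exhaustive search over every combination of post-strike games played and won. The following is a minimal sketch in Python under the assumptions stated in the text (team 1 = Los Angeles, team 2 = Cincinnati, advantages of at least half a game); the function name `paradox_possible` is illustrative, and leads are tracked at twice their value so that all arithmetic stays in integers.

```python
def paradox_possible(pre, scheduled):
    """Search for a post-strike outcome in which team 1 leads team 2 post-strike
    while team 2 leads team 1 over the combined season.

    pre       -- {team: (pre-strike wins, pre-strike losses)} for teams 1 and 2
    scheduled -- {team: post-strike games scheduled} for teams 1 and 2
    """
    (pw1, pl1), (pw2, pl2) = pre[1], pre[2]
    for g1 in range(scheduled[1] + 1):          # games actually played by team 1
        for g2 in range(scheduled[2] + 1):      # games actually played by team 2
            for w1 in range(g1 + 1):
                for w2 in range(g2 + 1):
                    l1, l2 = g1 - w1, g2 - w2
                    # Twice the post-strike lead of team 1 over team 2.
                    post2x = (w1 - w2) + (l2 - l1)
                    # Twice the overall lead of team 2 over team 1.
                    overall2x = (pw2 + w2 - pw1 - w1) + (pl1 + l1 - pl2 - l2)
                    if post2x >= 1 and overall2x >= 1:
                        return True
    return False


nl_west_possible = paradox_possible(
    pre={1: (36, 21), 2: (35, 21)},   # Los Angeles 36-21, Cincinnati 35-21
    scheduled={1: 53, 2: 52},
)
print(nl_west_possible)
```

The search finds no such outcome for the NL West, in agreement with the argument that the two constraints contradict each other regardless of how many games are cancelled.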
The instructor can further elevate this discussion and demonstrate that it is impossible under
any circumstances for a team to win both the pre- and post-strike division
titles but have an inferior overall (combined pre- and post-strike) win/loss record
in terms of games in the standings relative to a division rival. If we define
combined season total wins and losses to be $w_{i,T} = w_{i,1} + w_{i,2}$ and $l_{i,T} = l_{i,1} + l_{i,2}$,
we have

$$(w_{1,T} - w_{2,T}) + (l_{2,T} - l_{1,T}) = \left[(w_{1,1} - w_{2,1}) + (l_{2,1} - l_{1,1})\right] + \left[(w_{1,2} - w_{2,2}) + (l_{2,2} - l_{1,2})\right].$$

If i = 2 represents the team with best overall (combined pre- and post-strike) win/loss
record in its division in terms of games in the standings, the left-hand side
of this expression, $(w_{1,T} - w_{2,T}) + (l_{2,T} - l_{1,T})$,
must be negative. However, this can happen only if at least one of the two terms
$(w_{1,1} - w_{2,1}) + (l_{2,1} - l_{1,1})$
or $(w_{1,2} - w_{2,2}) + (l_{2,2} - l_{1,2})$
on the right-hand side of this expression is negative, i.e., if the team represented
by i = 2 finishes ahead of team 1 in either the pre- or post-strike period, in which
case team 1 cannot have won both titles. Here, students are given
insight into the deep perspective that the simple act of modeling a problem can provide.
3.2.2 Can a team have both the best pre- and post-strike win/loss percentages in
its division but still not have the best overall (combined pre- and post-strike) win/loss
percentage in its division?
Given the pre-strike results and number of post-strike games remaining, the formulation
of this problem is
where ε is again a sufficiently small value. If the number of post-strike
games played by each team is fixed (i.e., no cancellations), this is a relatively
simple linear integer programming problem. After substituting known values, the formulation
for the NL West (where i=1 represents the Los Angeles Dodgers and i=2 represents
the Cincinnati Reds) simplifies to

Again, the constant is left in the objective function to aid in interpretation -
a positive feasible value of this objective function indicates it is possible for
a team to have the best pre- and post-strike win/loss percentages in its division
but still not have the best overall (combined pre- and post-strike) win/loss percentage in
its division. The optimal objective value of this problem is negative, indicating it is
not possible for Los Angeles to have the best pre- and post-strike win/loss percentages in
its division and also have a worse overall (combined pre- and post-strike) win/loss
percentage than Cincinnati. Thus, Simpson's Paradox cannot occur in the NL West
Division given the pre-strike results and number of post-strike games scheduled.
Instructors can use this result to motivate a discussion on how Simpson's Paradox
occurs when there is a large discrepancy in the number of events observed (games
played) for the various categories (teams).
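With the post-strike game counts fixed, the win/loss percentage version can likewise be checked by enumeration. The following is a minimal sketch in Python rather than the Solver model; it assumes ε = 0.0001 (the value the text uses for the later variable-games model) and takes the objective to be Cincinnati's overall percentage minus that of Los Angeles. The function name `best_pct_gap` is illustrative.

```python
EPS = 0.0001  # assumed tolerance; the text uses this value in the later model


def best_pct_gap(pre_wins, pre_games, post_games, eps=EPS):
    """Maximize (team 2 overall pct - team 1 overall pct) subject to team 1 having
    the better post-strike pct by at least eps. Returns (gap, w1, w2)."""
    best = None
    for w1 in range(post_games[1] + 1):
        for w2 in range(post_games[2] + 1):
            if w1 / post_games[1] - w2 / post_games[2] < eps:
                continue  # team 1 does not win the post-strike half
            pct1 = (pre_wins[1] + w1) / (pre_games[1] + post_games[1])
            pct2 = (pre_wins[2] + w2) / (pre_games[2] + post_games[2])
            if best is None or pct2 - pct1 > best[0]:
                best = (pct2 - pct1, w1, w2)
    return best


# 1 = Los Angeles, 2 = Cincinnati (values quoted in the text).
gap, w12, w22 = best_pct_gap({1: 36, 2: 35}, {1: 57, 2: 56}, {1: 53, 2: 52})
print(gap, w12, w22)
```

The largest attainable gap is negative, which is the enumeration counterpart of the negative optimal objective value reported above.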
Finally, students must address the same issue under the condition that some post-strike
games may be cancelled. Although this is a more general potential case of Simpson's
Paradox, most students reflexively believe (incorrectly), given the previous results,
they have already answered this question.
Key to this formulation is again recognizing that the number of games played by team i in
the post-strike period ($g_{i,2}$) is now a decision variable that must
be integer and not exceed the number of games scheduled for team i in the post-strike
period ($s_{i,2}$). Note that the values of the $g_{i,2}$ must
also be positive so the constraint forcing the team that won the pre-strike division
title to have a superior post-strike win/loss percentage is defined. Thus,
the student must augment the previous formulation with the constraint set

$$1 \le g_{i,2} \le s_{i,2}, \quad g_{i,2} \text{ integer}, \quad i = 1, 2.$$

After substituting known values and setting ε = 0.0001, the formulation for the NL
West (where i=1 represents the Los Angeles Dodgers and i=2 represents the Cincinnati
Reds) simplifies to
The optimal solution ($w_{1,2} = 1$, $w_{2,2} = 0$, $g_{1,2} = 53$, $g_{2,2} = 1$)
yields a positive objective value (0.27767), indicating that it is possible for a
team to have the best pre- and post-strike win/loss percentages in its division
and have an overall (combined pre- and post-strike) win/loss percentage inferior
to another team in its division if some post-strike games are cancelled. The existence
of Simpson's Paradox has been demonstrated. Again, instructors can note that the
optimal solution occurs where the discrepancy in games played by the two teams ($g_{1,2}$ and $g_{2,2}$)
is greatest and use these results to further a discussion on how Simpson's Paradox
can occur when there is a large discrepancy in the number of events observed (games
played) for the various categories (teams).
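The reported optimum can be reproduced by fixing the games played by each team and solving the remaining problem for every pair of game counts. The following is a minimal sketch in Python rather than the Solver model; the names `max_gap_for_games` and `surface` are illustrative, and the objective is taken to be Cincinnati's overall percentage minus that of Los Angeles, with ε = 0.0001 as in the text.

```python
EPS = 0.0001
PRE = {1: (36, 57), 2: (35, 56)}  # (pre-strike wins, pre-strike games); 1 = LA, 2 = CIN
SCHEDULED = {1: 53, 2: 52}        # post-strike games scheduled


def max_gap_for_games(g1, g2, eps=EPS):
    """Largest overall pct gap (team 2 minus team 1) when g1 and g2 post-strike games
    are played and team 1 has the better post-strike pct by at least eps.
    Returns (gap, w1, w2), or None if no win totals satisfy the constraint."""
    best = None
    for w2 in range(g2 + 1):
        # Team 1's overall pct rises with its wins, so only its smallest
        # feasible post-strike win total can be optimal for a given w2.
        w1 = next((w for w in range(g1 + 1) if w / g1 - w2 / g2 >= eps), None)
        if w1 is None:
            continue
        gap = (PRE[2][0] + w2) / (PRE[2][1] + g2) - (PRE[1][0] + w1) / (PRE[1][1] + g1)
        if best is None or gap > best[0]:
            best = (gap, w1, w2)
    return best


# One subproblem per (g1, g2) pair with at least one game played by each team.
surface = {
    (g1, g2): max_gap_for_games(g1, g2)
    for g1 in range(1, SCHEDULED[1] + 1)
    for g2 in range(1, SCHEDULED[2] + 1)
}
(best_g1, best_g2), (best_gap, best_w1, best_w2) = max(
    ((k, v) for k, v in surface.items() if v is not None),
    key=lambda kv: kv[1][0],
)
print(best_g1, best_g2, best_w1, best_w2, round(best_gap, 5))
```

The `surface` dictionary holds one maximum gap per pair of game counts, so the same values can be passed to any plotting routine to examine how the potential for the paradox varies with the number of games each team plays.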
Upon noting that this is an extreme case (Los Angeles wins one of their fifty-three
post-strike games, while Cincinnati loses their only post-strike game that isn't
cancelled!), a student may arbitrarily choose what s/he considers more reasonable
values for $g_{1,2}$ and $g_{2,2}$. Depending on the chosen
values for $g_{1,2}$ and $g_{2,2}$, the student may or
may not find other potential occurrences of Simpson's Paradox. These efforts may
eventually lead a student to recognize that this problem can be linearized by iteratively
fixing the number of post-strike games played by each of the teams ($g_{i,2}$,
i = 1, 2) and solving the resulting formulation. These results can be used to generate
a contour plot of the number of post-strike games played by each team and the maximum
difference in overall win/loss percentage between the team that does not have
the best pre- or post-strike win/loss percentage in the division and the team
that does have the best pre- and post-strike win/loss percentages in the division.
Figure 1. Contour Plot of Maximum Difference in Overall win/loss percentage by games played.
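The data behind such a plot can be generated by fixing the games played and solving for the wins, as described above. The following is a minimal sketch rather than the author's VBA and Solver implementation: it uses plain enumeration in Python, and the pre-strike records (36 of 57 for Los Angeles, 35 of 56 for Cincinnati) and the scheduled post-strike games (53 and 52) are assumptions taken from the historical 1981 standings, not from this section. The names `max_gap` and `grid` are hypothetical.

```python
from fractions import Fraction

# Assumed data (not reproduced in this section).
PRE_W1, PRE_G1 = 36, 57     # Los Angeles pre-strike wins, games
PRE_W2, PRE_G2 = 35, 56     # Cincinnati pre-strike wins, games
S1, S2 = 53, 52             # scheduled post-strike games (assumed)
EPS = Fraction(1, 10000)    # margin forcing a strictly better percentage


def max_gap(g1, g2):
    """For fixed games played, return (gap, w1, w2) maximizing team 2's overall
    percentage minus team 1's, subject to team 1 having the better
    post-strike percentage by at least EPS."""
    best = None
    for w2 in range(g2 + 1):
        # Smallest integer w1 with w1/g1 >= w2/g2 + EPS; for a given w2 this
        # minimizes team 1's overall percentage and so maximizes the gap.
        need = (Fraction(w2, g2) + EPS) * g1
        w1 = -(-need.numerator // need.denominator)  # ceiling of need
        if w1 > g1:
            continue  # team 1 cannot win more games than it plays
        gap = (Fraction(PRE_W2 + w2, PRE_G2 + g2)
               - Fraction(PRE_W1 + w1, PRE_G1 + g1))
        if best is None or gap > best[0]:
            best = (gap, w1, w2)
    return best


# One cell per (g1, g2) pair; these values are what a contour plot would display.
grid = {(g1, g2): max_gap(g1, g2)
        for g1 in range(1, S1 + 1) for g2 in range(1, S2 + 1)}

(g1_star, g2_star), (gap_star, w1_star, w2_star) = max(
    grid.items(), key=lambda kv: kv[1][0])
print(g1_star, g2_star, w1_star, w2_star, round(float(gap_star), 5))
```

Cells with a positive gap are those where the anomaly is possible. Under the assumed data, cells where both teams play the same number of games have a negative gap.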
This plot would be relatively easy for a student to produce using VBA in conjunction
with Solver©. A student who produces this plot can respond intelligently
to the last question posed by the case (If any of these problems are possible,
can you design a resolution that will give all teams a competitive incentive after
the strike, justly reward or penalize teams for their pre-strike performances, and
avoid the pitfalls of the MLB split-season strategy?). The student could suggest
that MLB management use the plot to avoid an occurrence of Simpson's Paradox if similar
circumstances arose - MLB must ensure the two teams play an appropriate number of
games to keep them in the light blue region (center crease) of the surface on the
graph. If the teams in question are in danger of leaving this crease of the contour
plot, MLB must reschedule some of the rained-out post-strike games (which is actually
an MLB policy). This plot also provides an important insight into Simpson's Paradox
- the potential for this phenomenon grows with the discrepancy in post-strike games
played by two teams.
Students have also suggested several other creative strategies for avoiding the
pitfalls of MLB's split-season scheme (crown divisional champions for the pre-strike
and combined pre- and post-strike periods, or crown divisional champions for the
post-strike and combined pre- and post-strike periods) and can test each of these
solutions with the same approach they use to test MLB's split-season scheme. Students
generally do come to the realization that there is no absolutely fair resolution
to this problem (a very important conclusion).
A student could conceivably use other formulations to demonstrate that it is possible
for a team to have the best pre- and post-strike win/loss percentages in
its division and have an overall (combined pre- and post-strike) win/loss percentage inferior
to another team in its division if some post-strike games are cancelled. Optimization
(minimization or maximization) of the number of post-strike games played (gi,2)
or won (wi,2) by either team could be used as the objective. In
addition to the constraints from the previous formulation, this formulation requires
a constraint to force the team that did not win its division in either portion of
the split season to have the best overall (combined pre- and post-strike) win/loss
percentage in its division. For example, a formulation with this added constraint
is feasible only if the team that did not win its division in either portion of
the split season could have the best overall (combined pre- and post-strike) win/loss
percentage in its division.
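One way to picture such an alternative formulation is sketched below. It is not the author's formulation, which is not reproduced in this text. As an illustration, it maximizes the total number of post-strike games played, subject to the added constraint that team 2 (Cincinnati) has the better overall percentage. The pre-strike records, the scheduled games, the use of the same ε margin in the overall constraint, and the function names are all assumptions made for this example.

```python
from fractions import Fraction

# Assumed data (not reproduced in this section): (wins, games) pre-strike
# for team 1 (Los Angeles) and team 2 (Cincinnati), and scheduled games.
PRE = ((36, 57), (35, 56))
S = (53, 52)
EPS = Fraction(1, 10000)  # assumed margin for both strict inequalities


def anomaly_wins(g1, g2):
    """Return (w1, w2) if, with g1 and g2 games played, team 1 can have the
    better post-strike percentage while team 2 has the better overall
    percentage; return None if no such wins exist."""
    for w2 in range(g2 + 1):
        # Smallest w1 satisfying the post-strike constraint; if this w1 does
        # not give team 2 the overall lead, no larger w1 will.
        need = (Fraction(w2, g2) + EPS) * g1
        w1 = -(-need.numerator // need.denominator)  # ceiling of need
        if w1 > g1:
            continue
        overall1 = Fraction(PRE[0][0] + w1, PRE[0][1] + g1)
        overall2 = Fraction(PRE[1][0] + w2, PRE[1][1] + g2)
        if overall2 >= overall1 + EPS:
            return w1, w2
    return None


def most_games_with_anomaly():
    """Maximize g1 + g2 over all feasible points; None means infeasible."""
    best = None
    for g1 in range(1, S[0] + 1):
        for g2 in range(1, S[1] + 1):
            wins = anomaly_wins(g1, g2)
            if wins is not None and (best is None or g1 + g2 > best[0]):
                best = (g1 + g2, g1, g2, wins[0], wins[1])
    return best


result = most_games_with_anomaly()
print(result)  # (total games, g1, g2, w1, w2), or None if infeasible
```

A result other than `None` plays the role of a feasible solution in the formulation described above: its existence alone demonstrates that the anomaly can occur.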
This case effectively requires the student to use integer programming to determine
the feasibility of numerous anomalous circumstances regarding the 1981 MLB split
season. Proving or disproving the plausibility of these circumstances becomes progressively
more challenging, and the case culminates in proof of the existence of Simpson's
Paradox (an important, counter-intuitive, and frequently misunderstood statistical
concept). Thus, the case can be used to encourage students to integrate operations
research and statistics, thereby helping them understand the common nature of various
quantitative disciplines.
Many potential formulations, all of which are relatively simple, can conceivably
be derived from this case. While most students will not develop formulations that
address all potential issues presented in this case, the author's experience suggests
that an instructor can expect most of the issues to be addressed by at least one
student in a reasonably sized (at least fifteen students) course section. This provides
an excellent forum for discussion of various formulations, their ramifications, and
how they can be used to address the various issues of the case.
The author was able to solve each formulation in this paper to optimality using
Solver© in less than one minute on a standard Dell® Latitude
1.0 GHz laptop. The case scenario is familiar to many students, interesting, and
thought-provoking. Furthermore, although the case deals with a baseball-related problem,
it is relatively self-contained and requires no understanding of how baseball is
played. Thus, it neither alienates students who are unfamiliar with or uninterested
in baseball nor requires special accommodation for them.
Response of both undergraduate Management Science majors and MBA students to this
case has been overwhelmingly positive; the students are very pleased with what they
have learned from this experience. Students at both levels are surprised to see operations
research and statistics occur in the same case, and they come away from the case
with an extremely deep understanding of Simpson's Paradox (one undergraduate Management
Science major told the author he was hired for his job because he was able to convince his
prospective boss that Simpson's Paradox was a real phenomenon!). Students are also
surprised to learn that i) constraint satisfaction is a legitimate purpose for some
problems, ii) relatively small integer programming problems can yield very powerful
results, and iii) systematically changing parameters and re-solving an integer programming
formulation (i.e., sensitivity analysis) can yield tremendous insights. Grades on
examinations (particularly the cumulative finals) improved greatly after the author
introduced cases such as this into introductory Operations Research courses at both
the undergraduate and MBA levels.
Of course, these two disparate groups of students don't respond in a like manner
to the various aspects of this case. The undergraduate Management Science majors
have much greater interest in and appreciation of the technical nuances of the case,
while the MBA students appreciate the case's managerial and decision-making aspects.
The author has also received informal feedback from students after they have graduated
and been employed for a few years. They consistently report that the critical thinking
and communication skills they honed when working on cases such as 'Bowie Kuhn's Worst
Nightmare' give them an enormous advantage in the workplace. This underscores the
chief benefit of cases - they provide students an opportunity to apply operations
research methods within the context of real problems with authentic, tangible, and
discernible consequences.
Albert, J. (2002), "A Baseball Statistics Course," Journal of Statistics
Education, Vol. 10, No. 2 , 
Albert, J. An Introduction to Sabermetrics, 
Albert, J. (2003), Teaching Statistics Using Baseball, Mathematical
Association of America.
Albright, C. (1988), "A Statistical Analysis of Hitting Streaks in Baseball," Journal
of American Statistical Association, Vol. 88, No. 424.
"Baseball Statistics Glossary," NetShrine, 
Cochran, J. (2001), "Strat-O-Matic in the Classroom: Teaching Introductory
Probability and Statistics with a Popular Baseball Board Game," Joint Statistical
Meetings, Atlanta, GA.
Cochran, J. (2002), "Data Management, Exploratory Data Analysis, and
Regression Analysis with 1969-2000 Major League Baseball Attendance," Journal
of Statistics Education, Vol. 10, No. 2, 
Cochran, J. (2000), "Career Records for All Modern Position Players
Eligible for the Major League Baseball Hall of Fame," Journal of Statistics Education, Vol.
8, No. 2, 
Cochran, J. (2000), "Successful Use of Cases in Introductory Undergraduate
Business College Operations Research Courses," The Journal of the Operational
Research Society, Vol. 12, No. 51.
Costa, G. and M. Huber, (2003), "Whaddya Mean? You get Credit for Studying
Baseball?" technical report.
Gallian, J. (2001) "Statistics and Sports: A Freshman Seminar," ASA
Proceedings of the Section on Statistics in Sports.
Harris, C. and K. Arth, (2003), The Collective Bargaining Agreement
for Fans, 
InfoPlease, (2001), 
Lackritz, J. (1981), "The use of Sports Data in the Teaching of Statistics," ASA
Proceedings of the Section on Statistical Education.
Lahman, S. (1996), "A Brief History of Baseball: Part III: Labor Battles
in the Modern Era," The Baseball Archive 
Ladany, S. and R. Machol, (1977), eds. Optimal Strategies in Sports, North-Holland
Publishing Company.
Lindsey, G. (1959), "Statistical Data Useful for the Operation of a
Baseball Team," Operations Research, Vol. 7, No. 3.
Lindsey, G. (1961), "The Progress of a Score During a Baseball Game," The
Journal of the American Statistical Association, Vol. 56, No. 295.
Lindsey, G. (1963), "An Investigation of Strategies in Baseball," Operations
Research, Vol. 11, No. 4.
Lock, R. (1997), "NFL Scores and Pointspreads," Journal of Statistics
Education, Vol. 5, No. 3, 
McKenzie, J. (1996), "Teaching Applied Statistics Courses with a Sports
Theme," ASA Proceedings of the Section on Statistics in Sports.
Nettleton, D. (1998), "Investigating Home
Court Advantage," Journal
of Statistics Education, Vol. 6, No. 2, 
Pappas, D. (2003), "A Contentious History: Baseball's Labor Fights," Baseball
Prospectus, 
Pappas, D. (1988), "Thirty Years of Collective Bargaining Agreements," SABR
28, 
Quinn, R. (1997), "Investigating Probability with the NBA Draft Lottery," Teaching
Statistics.
Quinn, R. (1997), "Anomalous Sports Performances," Teaching Statistics.
Reichler, J. (2001) The Baseball Encyclopedia, New York: MacMillan
Publishing Company.
Reiter, J. (2001), "Motivating Students' Interest in Statistics through
Sports," ASA Proceedings of the Section on Statistics in Sports.
Ross, K. (2004), A Mathematician at the Ballpark, Pi Press.
Simonoff, J. (1998), "Move Over, Roger Maris: Breaking Baseball's Most
Famous Record," Journal of Statistics Education, Vol. 6, No. 3, 
Starr, N. (1997), "Nonrandom Risk: The 1970 Draft Lottery," Journal
of Statistics Education, Vol. 5, No. 2, 
Thorn, J., and P. Palmer (2001), Total Baseball, New York: HarperCollins
Publishers.
Watnik, M. and R. Levine (2001), "NFL Y2K PCA," Journal of Statistics
Education, Vol. 9, No. 3, 
Watnik, M. (1998), "Pay for Play: Are Baseball Salaries Based on Performance?" Journal
of Statistics Education, Vol. 6, No. 2, 
Wiseman, F. and S. Chatterjee (1997), "Major League Baseball Player
Salaries: Bringing Realism into Introductory Statistics Courses," The American
Statistician, Vol. 51, No. 4.
To reference this paper, please use:
Cochran, J. (2004), "Bowie Kuhn's Worst Nightmare," INFORMS Transactions on Education, Vol. 5, No. 1,
http://ite.pubs.informs.org/Vol5No1/Cochran/