Informal Note
September 10, 1989
Minor Revision April 1997
Arthur Geoffrion
UCLA Anderson School of Management
There are certain beliefs that underlie how I try to behave when I do applied
modeling work. This note endeavors to make some of these beliefs explicit in
the form of "maxims". That most of the maxims are conventional wisdom
does not diminish their importance to successful professional practice (the
same was true of the exhortations in Peters and Waterman
<1982>).
My purpose in writing on this subject is twofold: (1) I need a relatively compact
checklist to remind myself what I should be doing when I work on applications;
and (2) I want to share with my students and selected colleagues some of what
I have learned about applied work. Consequently, this is not a "research"
paper and is far from being suitable for publication.
Each maxim bears a number for the sake of easy reference. Some maxims comprise multiple submaxims, lettered accordingly. Some discussion aimed at justification and at identifying further reading is given for each maxim. The complete list is as follows:
#1) Reserve the Right of Problem Formulation
#2) Develop a Clear Charter and Project Plan
#3) Find a High Level Champion
#4) Establish Personal Credibility Early
#5) Produce Results Early
#6) Involve the Sponsor and Future Users at All Stages
#7) Communicate Often and Well
#8) Keep It Simple
#9) Be Creative About Alternatives
#10) Separate General Structure and Instantiating Data
#11) Separate Model, Problem/Task, and Solver
#12) Model the Data Development Phase
#13) Know Your Data
#14) Compose From Submodels
#15) Choose Notation Carefully
#15a) Minimize the Need for Manual Language Translation
#15b) Use Mnemonic Notation
#16) Maximize Understandability
#16a) Exploit Parallel Structure
#16b) Use Modules
#16c) Organize Things Hierarchically
#16d) Document Thoroughly
#16e) Avoid Forward References
#16f) Hide Inessential Detail
#16g) Draw Pictures
#17) Plan for Change
#17a) Document Interdependencies
#17b) Use Stepwise Refinement
#17c) Avoid Structural and Dimensional Arthritis
#18) Match Model Realism with Solver Capability
#19) Work Within Established Standards
#20) Prevent Errors Before They Occur
Notice that I have concentrated on leadership matters and on the early stages
of a modeling project. Much more could be said about the more technical aspects
of building, maintaining, and using models for their intended purposes. I do
not claim that the maxims given are in any sense complete. They are not, nor
could any list ever be.
I believe that these maxims will contribute greatly to the success of modeling
professionals if followed faithfully. Many are part of the common wisdom
and will strike most readers as obvious. Others may be less obvious and
perhaps even seldom observed. All are the result of experience and reflection.
However, I readily confess that I have shamelessly stolen many of these
maxims from others, and also that I lack the fortitude to observe all of
them all of the time.
I sincerely invite readers to take issue with the views presented here,
to offer new supporting arguments and counter arguments, and to suggest
additional or alternative maxims together with their rationale.
The maxims in this section concern how one conducts oneself relative
to others during the course of modeling work, and how one manages a modeling
project. They may seem to have little to do with technical expertise, but
they could not be followed effectively without excellent technical capabilities.
#1) RESERVE THE RIGHT OF PROBLEM FORMULATION
It is a myth to think that the client of a modeling study hands the problem
formulation down from on high. Most executives and decision-makers in need
of modeling help do quite a poor job of that, so the leaders of a modeling
effort should endeavor to keep early interaction with the client to general
objectives and background.
Section 10.4 of Miser and Quade <1985> contains
an insightful discussion of this point:
One might conclude that the analyst should try to get the manager to sharpen his problem statement. However, experience tells us overwhelmingly that this is the opposite of what is desirable at the beginning: the analyst is well advised to keep the manager's appreciation of his problem as broad and general as possible, so that the analyst making early inquiries into the situation is free to formulate the problem (if indeed this is possible) without the inhibiting constraint of an authoritative misperception. In fact, in my experience, perhaps the worst thing that can happen is for the executive to write a memorandum stating what the problem is, particularly if he is a very strong and dominating personality; his statement then becomes a major deterrent to developing the realistic problem appreciation needed for good analysis, and makes it doubly hard to get this appreciation accepted. The moral is plain: At the beginning, keep the discussions and interactions as broad and flexible as possible, to the end that the early fact finding and analysis can dominate how the problem is formulated.
#2) DEVELOP A CLEAR CHARTER AND PROJECT PLAN
Every model-based project needs a clear statement of objectives and how
the project will go about achieving them. Moreover, this statement must
be mutually acceptable to management and to the project leaders. It serves
as both charter and compass for the voyage.
How can anyone hit a target if they don't know what it is? Moreover, since
there is often more than one person exercising independent initiative,
the crossfire can be dangerous when different people have different targets.
This is known as "working at cross-purposes".
Miser and Quade <1985> has a good discussion
in Section 10.5. See also Hammond <1974>.
One of the maxims in Morris <1967> is: Establish
a clear statement of the deductive objectives.
#3) FIND A HIGH LEVEL CHAMPION
Many of the successful projects I know of were so because of the faithful
patronage of a senior manager. Such patronage is necessary to weather the
many storms -- budgetary, organizational, and political -- to which most
significant projects are subject, especially those that cannot be accomplished
within a short time (a couple of months, say).
Hammond <1974> comments on the importance
of a champion (p. 114).
#4) ESTABLISH PERSONAL CREDIBILITY EARLY
Credibility means that your peers and superiors recognize that you are competent,
have integrity, and are able to make realistic estimates of the time and resources
required to accomplish something. In other words, you are trusted and
respected.
Your effectiveness in any significant role, whether of a leadership or subordinate
nature, is seriously and often fatally undermined without the ability to establish
trust and respect.
#5) PRODUCE RESULTS EARLY
Most sponsors of modeling work have a short-term outlook. They need to
see results. Modeling professionals must understand this fact of life,
and give priority to getting at least preliminary results as soon as possible.
See page 119 of Hammond <1974>.
This maxim may argue for the use of rapid prototyping techniques.
#6) INVOLVE THE SPONSOR AND FUTURE USERS AT ALL STAGES
Almost every article that reports on successful modeling work or offers
advice for practitioners mentions the importance of heavily involving representatives
of the sponsor and users at all stages (e.g., Hammond
<1974>). The reason has to do partly with psychology, partly
with the genuine need for such input, and partly with the proper role of
modeling as a complement and supplement to reasoned judgement rather than
as a replacement for it.
One way to approach this maxim is to ask yourself, very early in a project,
how you can get the sponsors and users to "own" the envisioned
system and have most of its features be "their" idea. If you
do this, you will be forced to involve sponsors and users and to rein in the
rampant technocentrism that might otherwise undermine even projects of
outstanding technical excellence.
The benefits of having at least one member of the client organization participate
throughout are explained on page 296 of Miser and
Quade <1985>.
#7) COMMUNICATE OFTEN AND WELL
It would be difficult to overestimate the value of effective communication
between the project team and the sponsor and users, at all stages of the
project.
Speak in the language of the sponsors and users. Relate all communications
to their needs; applied model-based work is done to help them, not
you.
Speak in terms of WHAT and WHY, not HOW. For example, speak in terms of
inputs and outputs in preference to the technology of what happens in between.
But -- and this is very important -- be prepared to help sponsors and users
gain a good understanding of why the modeling results are what they
are (one of my sermons on this, and a possible approach, is the subject
of Geoffrion <1976>).
Sections 10.11 and 10.13 of Miser and Quade <1985>
contain a cogent discussion of the importance of communication and how
to do it effectively.
How one conceptualizes that which is to be modeled, and then designs
a model, sets the stage for actually building and using the model. A good
conceptualization and design does not guarantee success later on, but a
bad one all but guarantees failure.
Most of the maxims given in what follows should be kept in mind not only
during the conceptualization and design phase of modeling work, but also
during the phases in which models are built, used, and maintained.
#8) KEEP IT SIMPLE
A useful conceptualization and model design ignores inessential details
about the thing being modeled, so that excessive detail does not obscure
what is really important. Moreover, complexity causes
technical difficulties such as inefficiency and unmaintainability. It can
easily cause a kind of paralysis that scuttles all chances of success.
Section 10.7 of Miser and Quade <1985> gives
a cogent quote from H. Raiffa urging simplicity. See also page 117 of Hammond
<1974>.
Here is a nice quote from C.S. Holling appearing in Miser
<1989>:
Any model is a caricature of reality. A caricature achieves its effectiveness by leaving out all but the essential; the model achieves its utility by ignoring irrelevant detail. There is always some level of detail that an effective model will not seek to predict, just as there are aspects of realism that no forceful caricature would attempt to depict. Selective focus on the essentials is the key to good modeling.
Excessive simplicity is equally devastating, of course, but the kinds
of people who undertake model-based work seem genetically inclined far
more toward excessive complexity than toward excessive simplicity.
#9) BE CREATIVE ABOUT ALTERNATIVES
Even the most sophisticated optimizers and other model manipulation technologies
that will be available a millennium from now will not be able to arrive
at creative solutions and recommendations if the alternatives with which
they must work are mundane.
There is nothing special about the training of OR/MS professionals which enhances
their ability to be creative in devising alternatives for analysis. This trait
must be cultivated throughout one's professional life.
See Section 10.7 of Miser and Quade <1985>
for a good discussion of this point. They point out that truly creative
alternatives sometimes get analysts into trouble with their clients, and
that is one of the professional risks with which modeling professionals
must live. I would add that, if you aren't ruffling a few feathers now
and then, you probably aren't being creative enough in coming up with imaginative
alternatives.
#10) SEPARATE GENERAL STRUCTURE AND INSTANTIATING DATA
General structure refers to a class of model instances that
captures virtually all of the instances of possible interest, and that
is convenient to work with. Each model instance can be represented as a
particular general structure together with particular instantiating
data.
Since general structure is dimension-independent and data-independent,
it is less volatile and more compact than specific model instances. This
implies that it is well-suited to such uses as mathematical analysis, auditing,
verification, reuse, and being relatively easily understood and communicated.
The main ideas in this paragraph and the preceding one are discussed in
more detail in Geoffrion <1992>. See
also "desired feature (e)" in Section 1.2 of Geoffrion
<1987>.
The idea that reusability is an important property of general structure is the
subject of another paper of mine that also recognizes the relevance of analogous
ideas in software engineering (Geoffrion <1989>).
A closely related idea is modeling by analogy. For example, Morris
<1967> asserts that analogy or association with previously well
developed logical structures plays an important role in modeling. This is
really the "reuse" idea. Later he states the maxim: Seek analogies.
If thinking involves abstraction and abstraction is a major aim
of general structure, then general structure derives some of its importance
from the importance of thinking.
If general structure and instantiating data can be separated sufficiently,
then it may be possible to use a modern database system to manage the data.
This point has been recognized in recent years by a number of authors including
Bürger <1982>, who presents a design
for an LP system, MLD, that is one of the best-integrated in terms of an
algebraic modeling language together with a supporting relational database
system for data manipulation. Here is a pertinent quote:
In MLD, model solutions are obtained by binding a data module to a model module and executing it. This mechanism makes it easy to use the same data with different models, or to solve a model with different data. The binding between a class of models and a class of data modules is defined by a so-called execution module. The concept of an execution module has some interesting implications for an applied environment: model and execution module can be prepared by an 'expert', while data preparation, model execution, and analysis of the model results then can be carried out by a 'client'.
To put one of Bürger's points a bit differently, one could say
that general structure and instantiating data are developed and used for
different purposes, often by people with vastly different backgrounds;
to confound them is to inhibit efficient divisions of labor and to promote
confusion.
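Bürger's binding of a data module to a model module can be sketched very simply. Here is a minimal illustration in Python (the transportation-flavored names and figures are invented for illustration, not drawn from MLD):

```python
# General structure: a transportation-cost template, independent of any
# particular dimensions or data.
def total_cost(instance, flows):
    """Evaluate the template's objective on one bound instance."""
    return sum(instance["COST"][i, j] * flows[i, j]
               for i in instance["PLANTS"]
               for j in instance["CUSTS"])

# Instantiating data: one particular model instance, kept separate so the
# same structure can be bound to many data modules.
instance = {
    "PLANTS": ["p1", "p2"],
    "CUSTS": ["c1"],
    "COST": {("p1", "c1"): 4.0, ("p2", "c1"): 6.0},
}

flows = {("p1", "c1"): 10.0, ("p2", "c1"): 5.0}
print(total_cost(instance, flows))  # prints 70.0
```

A different data module (more plants, different costs) binds to the same structure unchanged, which is exactly the division of labor between "expert" and "client" described above.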
Part of the concept of separating structure and data is the idea that the
latter should be represented as a minimally redundant extension of the
former. The reason is simply that it is inefficient to repeat general structure
as part of the representation of instantiating data. Moreover, redundancy
opens the door to inconsistency.
Separating logic and data preserves the integrity of models and datafiles
and simplifies application development, maintenance and consolidation processes.
(1986 sales brochure for IFPS/PERSONAL).
An analogous distinction (database schema vs. actual data) has been important
in the field of database management since the early 1970s, and has been
argued persuasively in that context (e.g., Date <1982>).
An analogous distinction also is important in the field of computer programming.
The guest editor of a special issue of Computer says (Hong
<1986>): In early programs, data and code were often intermixed.
Modern-day programs generally have clearer separation between data and program
constructs. This same point is a strong theme in Davis
<1986>.
#11) SEPARATE MODEL, PROBLEM/TASK, AND SOLVER
I discussed these distinctions in Section 2.3 (see also desirable feature
(b) of Section 1.2) of Geoffrion <1987>.
Basically: (a) a model is a representation of some aspects of reality,
(b) a problem or task describes something to be done with
a model, and (c) a solver is a manipulator of a model according
to some definite procedure for solving a problem or performing a task.
Perhaps the most obvious reason for the importance of the model/solver
distinction is that it encourages the same model to be used with different
solvers (perhaps to solve different problems or to carry out different
tasks), and it encourages the same solver to be used with different models.
Such reuse can save a lot of time and resources.
Another obvious reason for distinguishing between a model, a problem or task,
and a solver is the pursuit of conceptual clarity. Non-specialists can easily
become confused otherwise. For example, every consultant has had a client ask
"Can you handle such-and-such a feature?". The true answer is often
Yes and No: "Yes" in that you could include the feature in the model,
but "No" in that an otherwise excellent solver would become inapplicable
or inefficient. This answer will not be understood unless the client knows the
conceptual difference between a model and a solver, and knows that a great variety
of problems and tasks lies between the two. For another example, a user who
is presented with an "LP model" may fail to understand that its database
can be used for ad hoc retrieval, that the objective function and a constraint
can switch roles in order to drive on a different criterion, or that the model
can be used for static simulation on a casewise basis. It would be better for
the user to be presented with a "model" on which many problems and
tasks may be posed, some of which may include optimization performed with the
help of optimizing solvers.
A more subtle reason for distinguishing between model and solver is that
not doing so inevitably leads to predicating model design on a particular
solver's limitations. This inhibits keeping track of modeling decisions
that might warrant reversal if and when a more capable solver becomes available.
This maxim runs contrary to much common practice. For example, practitioners
often state an optimization problem simultaneously with the model itself
(e.g., "Maximize f(x) over x subject to g(x) = b, where x represents...").
It is a simple matter to describe the model itself first, and then to pose
an optimization problem on the model.
... the equations used in stating the model should be kept separate from the method devised to solve them. (page 8 of Westerberg <1985>).
Analogous distinctions are important in the neighboring field of knowledge-based systems, where the knowledge base is analogous to the model and the inference engine is analogous to the solver. Speaking about such systems for the lay reader, Davis <1986> said
One of the distinguishing characteristics of these systems is the sharp distinction between the inference engine and the knowledge base. This division has two interesting consequences. First, it makes possible the substitution of a new knowledge base for a new task in place of the existing knowledge base, producing a new system as a result ... [using] the same inference engine. ... Second, it encourages taking an additional step along what has been called the 'how to what' spectrum ... encouraging the construction of a knowledge base containing what the program should know, rather than what it should do. ... The most important consequence of this distinction is that it enables the system to make multiple different uses of the same knowledge, facilitating explanation, knowledge acquisition, and tutoring. The separation of inference engine and knowledge base, and resulting multiple use of the same knowledge, is thus at the root of an expanded view suggesting that a program may do considerably more than compute an answer.
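The model/problem/solver separation can be made concrete in a few lines. A minimal sketch in Python, with an invented toy model and a simple enumeration "solver" standing in for a real one:

```python
# The model: a representation of some aspect of reality, nothing more.
def model_f(x):
    return -(x - 3.0) ** 2 + 9.0

# A solver: a manipulator of a model according to a definite procedure
# (here, brute-force enumeration over a candidate set).
def maximize(f, candidates):
    return max(candidates, key=f)

# The problem/task: maximize model_f over a finite candidate set.
candidates = [0.0, 1.0, 2.0, 3.0, 4.0]
best = maximize(model_f, candidates)
print(best)          # prints 3.0

# The same model, untouched, serves a different task: casewise evaluation.
print(model_f(1.0))  # prints 5.0
```

Because the model is a free-standing artifact, a more capable solver can later replace `maximize` without the model being redesigned around the old solver's limitations.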
#12) MODEL THE DATA DEVELOPMENT PHASE
I sometimes make essentially this point in my classes by recommending that
"upstream" calculations should be made fully explicit in spreadsheets
and other modeling media.
From the point of view of the solver, especially if the solver is an optimizer,
most of the data processing and computations done as part of the data development
phase can be ignored once the final distilled model is made ready for the
solver. The solver has no need to know of such preliminaries, and in the
rush of real applications this preparatory work is all too often discarded
or left undocumented once it has served its purpose.
The point of this maxim is to recommend that this preparatory work be formalized
to an appropriate degree, rather than being viewed as an annoying but necessary
evil -- grubby work that can be forgotten as soon as it is done. The reason
is simple: the truly essential nature of this much maligned work should
earn it just as much attention, understanding, status, and documentation
as are accorded the more glamorous analytical work that it makes possible.
Thus it is necessary to formally incorporate data development logic into
"the model".
Inadequate documentation of the data development process is a chronic complaint
of modeling professionals. Modeling data development helps to assure its
documentation, at least in a technical sense.
My intention here is to argue not only that data development should be
modeled, but also that it should be modeled in the same language (or by
the same formalism) used for the main part of the model at hand. For example,
avoid representing a linear programming model in an algebraic modeling
language at the same time as its source data are represented in a spreadsheet.
Pick one formalism or the other (or one that is flexible enough to handle
both parts with ease). The reason is that using two formalisms rather than
one adds to both real and apparent complexity, and contributes to the Tower
of Babel effect that plagues the modeling community.
If model specifications are executable, then modeling data development
can automate some of the work that would have to be done in some other
way (this is especially important when data development steps need to be
executed multiple times).
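As an illustration of making "upstream" calculations explicit and executable, here is a hypothetical sketch in Python; the raw figures and the cost formula are invented:

```python
# Data development step, recorded in the same formalism as the model
# rather than done once by hand and forgotten.
def unit_cost(freight_rate, distance, handling):
    """Derive a model coefficient from raw source data."""
    return freight_rate * distance + handling

# Raw source data, kept visible alongside the derivation.
raw = {"freight_rate": 0.05, "distance": 120.0, "handling": 2.0}

coefficient = unit_cost(**raw)
print(coefficient)  # prints 8.0
```

When the raw data are revised, the derivation simply reruns; the coefficient never becomes an unexplained number whose provenance is lost.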
#13) KNOW YOUR DATA
Every statistician knows that knowing the data well is essential in order to
draw any meaningful conclusions. But, surprisingly, many people trained in OR/MS
don't know this, at least until they have participated in a few projects. The
reason may be that OR/MS people tend to be oriented toward mathematical manipulation
of models, which nearly always assumes that the data are credible. Of course,
they may not be.
Miser and Quade <1985> make this point forcibly:
... the central lesson of my experience: that what one knows about the supporting evidence will play a very large role in how the findings of the analysis are interpreted. This point argues against using data already gathered unless absolutely necessary, and certainly against using them without knowing how they were gathered and -- equally important -- how they were processed. In many cases systems analysts cannot avoid using data gathered elsewhere for other purposes ... but considerable effort should be devoted to learning how these data were developed, and what their strengths and weaknesses are, so that the findings of the analysis can take account of such knowledge. Perhaps one of the most important pitfalls of analysis is to put more credence in data than is warranted by the way they were developed.
#14) COMPOSE FROM SUBMODELS
Some people believe that composition (assembly) from tested submodels is the
only way to successfully build a complex model. See, for example, Hogan
and Weyant <1983>. See also pages 6 and 8 of Westerberg
<1985>: ... to create really large models, one must assume that
most of the model will already exist in the form of submodels stored within
a library. ... The model should be created in parts, each of which can be tested.
Once tested, the parts can become part of a larger model with some assurance
that at least the parts are well formulated. ... Thus any language MUST support
the development of models built up of previously created models.
Ideally, one wants to compose models from submodels in a formal rather
than ad hoc way, within a unifying framework. Probably this requires that
the component models all be expressed in a common formalism; if not originally,
then in the course of composition. A modest literature is growing up around
this approach, which usually is known as "model integration".
One of the goals of this literature is to automate integration.
Analogous ideas often are discussed in software engineering. See, for example,
the discussion in Cox <1986> on "Software
ICs".
#15) CHOOSE NOTATION CAREFULLY
Notation can be immensely powerful if well designed. It can keep the user
out of trouble, focus attention on what is important, be compact, and almost
have an intelligence of its own. Careful attention to notation and representational
devices in general is one of the hallmarks of good modeling.
#15a) MINIMIZE THE NEED FOR MANUAL LANGUAGE TRANSLATION
The three most common language types for representing models are plain English,
mathematics, and computerese. All three are used in a typical model-based project,
and other types of languages are often used as well. Each serves a
useful purpose, but such multiple languages do have disadvantages, including:
a) Some people who play vital roles in model-based work will resist spending
the time needed to learn a new language that they don't already know, especially
if already well established in their careers. This makes communicating with
such people difficult.
b) It can take a lot of skilled time to translate back and forth among languages.
c) Multiple languages are redundant, which makes them susceptible to all of
the diseases of redundancy, including inefficiency and the possibility of
inconsistency when changes are necessary. See Fourer
<1983> for an excellent discussion of such disadvantages in the
context of LP optimization.
The disadvantages of multiple languages are mitigated if translations among
them can be accomplished automatically. For example, an executable modeling
language permits automatic translation into computerese. It may also permit
automatic translation into stylized but plain English model documentation, and
into mathematical notation of a relatively traditional sort.
#15b) USE MNEMONIC NOTATION
Choosing mnemonic names for the objects that arise in modeling work -- especially
mathematical objects -- facilitates communication with others and enhances maintainability.
A good mnemonic name is no longer than necessary to be unique and remind the
reader what it stands for.
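A two-line illustration of the point, in Python with invented names; both functions compute the same thing, but only the first tells the reader what it computes:

```python
# Mnemonic: the name reminds the reader what the quantity is.
def ship_cost(rate_per_mile, miles):
    return rate_per_mile * miles

# Cryptic equivalent: correct, but opaque to anyone maintaining it later.
def f(a, b):
    return a * b

print(ship_cost(0.5, 100.0))  # prints 50.0
```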
#16) MAXIMIZE UNDERSTANDABILITY
It is important for models to be designed and documented in such a way
as to facilitate demonstrating, to builders as well as sponsors and users,
that they are "correct" in concept. This is almost the same as
designing and documenting models in such a way as to make them understandable
to all concerned.
The maxims that fall under this rubric contribute to understandability.
#16a) EXPLOIT PARALLEL STRUCTURE
When several things have a lot in common, understanding any one of them
can be nearly the same as understanding them all. This makes it advantageous
to treat them as a group, notationally as well as conceptually.
Indexing structures are the main means by which parallel structure is exploited
notationally. This point is discussed in Section 1 of Geoffrion
<1992>.
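A minimal sketch of the idea in Python, with invented data: instead of separately named parallel objects (demand_east, demand_west, ...), one indexed family captures them all, and one statement covers every member.

```python
# An indexed family in place of many separately named parallel objects.
REGIONS = ["east", "west", "south"]
demand = {"east": 120, "west": 80, "south": 150}  # invented data

# Understanding this one statement is understanding the whole family.
total_demand = sum(demand[r] for r in REGIONS)
print(total_demand)  # prints 350
```

Adding a fourth region changes the data, not the structure, which is exactly the dimension-independence argued for under maxim #10.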
#16b) USE MODULES
A good way to deal with something complex is to break it down into its
important parts or modules. If wisely defined, these modules are more easily
understood and manipulated than the whole which comprises them. Thus models
should be designed not as monoliths, but rather as interconnected collections
of modules.
But don't take the modularization process too far, or you will arrive at
an atomic view of the whole that is equally unmanageable. The atomic view
obscures the forest for the trees. Important connections and similarities
are not clear. A model built as a mosaic out of tiny pieces is boring and
incomprehensible to all but the trained intellect able to catch the Gestalt
of it. (Gestalt has been defined as "a structure or configuration
... so integrated as to constitute a functional unit with properties not
derivable from its parts in summation.") Well modularized models don't
depend on Gestalt; the properties of the whole are evident from the sum
of the interconnected modules.
This maxim is related to the Compose From Submodels maxim (#14).
Modularization has come to assume a large role in software engineering,
and some of the reasons for this are equally applicable in the context
of modeling. This point is discussed in Section 1 of Geoffrion
<1989>.
One of the maxims in Morris <1967> is: Factor
the system problem into simpler problems. This is essentially the same as
the concept of modularization. Later he says: The real objective of systems
analysis is not simply to study larger and larger problems, but to find ways
of 'cutting' large problems into small ones, such that the solutions of the
small ones can be combined in some easy way to yield solutions for the large
ones.
#16c) ORGANIZE THINGS HIERARCHICALLY
Hierarchical organization is widely practiced and recognized in many fields
as an effective way to deal with complexity (which, in turn, is the bane
of understanding and communication). Hierarchical organization should be
applied to the conceptual units of any model and its documentation. That
is, use the classical outline form.
For an example of hierarchical organization, one need look no further than
the way Unix or Windows files are organized through directories and subdirectories.
#16d) DOCUMENT THOROUGHLY
Inadequate documentation has been identified again and again as a major
factor contributing to failures of modeling projects and the premature
demise of modeling systems. It impedes understanding, communication, maintainability,
and necessary evolution.
See Gass <1984> and Section 10.9 of Miser
and Quade <1985>.
A readable executable modeling language solves the "documentation
problem" by dissolving the distinction between a model and its documentation.
The model becomes its own documentation, and the documentation becomes
the model.
#16e) AVOID FORWARD REFERENCES
Avoiding forward references when defining a model provides a measure of
protection from circular definitions.
A document with no forward references can be read and understood in a single
pass. There is no need to suffer the inefficiency and distraction of having
to jump ahead into unfamiliar territory in order to obtain needed definitions.
Avoiding forward references tends to increase the sequentiality of a model's
description, and consequently its "simplicity". The virtues of
such simplicity have been argued in the related context of structured programming
(e.g., Dahl, Dijkstra, and Hoare <1972>).
#16f) HIDE INESSENTIAL DETAIL
One of the basic principles of good communication is to avoid confusing
the receiving party with irrelevant detail. Models often contain a great
deal of detail, most of which is irrelevant in the context of any particular
discussion or explanation; thus provision should be made to hide such irrelevant
detail so long as it can be unhidden when necessary.
Tree-oriented editors (outliners) have the ability to hide and unhide subtrees.
This is an excellent means for hiding inessential detail when a model's
description is organized hierarchically (per another maxim).
The ability to hide detail selectively gives the ability to construct different
"views" of a model appropriate to different audiences.
#16g) DRAW PICTURES
Graphical displays facilitate visualizing important aspects of a model.
Advocates of modeling paradigms that have natural graphical displays, such
as decision analysis, graph theory, and network flow, argue strongly that
such displays greatly enhance user understanding.
Some mathematical constructs, such as trees and directed graphs, lend themselves
to graphic displays. To the extent that such displays can be understood by people
without mathematical training, the difficulties of mathematical modeling are
diminished.
Sen <1992> gives additional arguments in favor
of pictures. A particularly interesting one is that a well-drawn graph
embeds associative relationships in the adjacency of its features,
and that this adjacency is not only apparent to viewers but can also give
an inherent efficiency advantage to algorithms that process the graph.
#17) PLAN FOR CHANGE
Every applied model must accommodate change. The reasons for change include
progression through the various stages of specification completeness as
data are developed and revised, the evolution of general structure as the
modeler gains experience and as requirements shift, and maintenance as
unsuspected deficiencies are discovered or the modeling environment changes.
Obviously, the flexibility to accommodate such changes is essential to
the viability of the project.
It warrants emphasis that, as noted in the previous paragraph, change is not
something that happens only after a model has been designed and built. It is
a way of life while building a model, too. As Morris <1967>
has stressed, the effective modeler constantly alternates between modifying
the model and confronting it with data.
The issue raised here is recognized as a major problem in software engineering,
where maintenance costs often dwarf original development costs. Some of
the remedies proposed in that context are applicable also to modeling.
#17a) DOCUMENT INTERDEPENDENCIES
Making changes in a model design raises the possibility of introducing
inconsistencies. Most inconsistencies arise because the ramifications --
the "interdependencies" -- of what was changed were not understood
well enough. Therefore, it is desirable for the interdependencies among
the various components of a model to be visible and explicit in the model
documentation or the model itself. Then when something changes it will
be possible to work out what the change could possibly affect (directly
or indirectly).
Explicitly documented interdependencies make it easy to determine exactly
what parts of a model participate in the support or definition of any particular
model component.
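The determination just described can even be mechanized once interdependencies are recorded explicitly. A minimal Python sketch (component names hypothetical) that computes everything a change could affect, directly or indirectly:

```python
# A hypothetical dependency record: deps[x] lists the components that
# x is defined in terms of.
deps = {
    "TotalCost": ["ShipCost", "Flow"],
    "ShipCost":  ["Distance", "Rate"],
    "Flow":      [],
    "Distance":  [],
    "Rate":      [],
}

def affected_by(component, deps):
    """Return every component whose definition depends, directly or
    indirectly, on the given component."""
    result = set()
    frontier = [component]
    while frontier:
        current = frontier.pop()
        for name, uses in deps.items():
            if current in uses and name not in result:
                result.add(name)
                frontier.append(name)
    return result

# Changing Rate affects ShipCost directly and TotalCost indirectly,
# so this call returns the set {'ShipCost', 'TotalCost'}.
print(affected_by("Rate", deps))
```

The same record, read in the other direction, answers the complementary question of which components support the definition of any given component.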
#17b) USE STEPWISE REFINEMENT
Stepwise refinement is a popular and much-used approach to the top-down
development of complex systems in many fields. For example, it has been
used in software engineering at least since the classic paper by Wirth
<1971>. A system built this way is more apt to accommodate
change than one that was not.
The processes of stepwise refinement are close to those of evolutionary
change, so if a model and a modeling environment can support the former,
they very likely can support the latter.
Morris <1967> treats "enrichment"
and "elaboration" as basic modeling skills. These concepts are
close to the notion of stepwise refinement.
#17c) AVOID STRUCTURAL AND DIMENSIONAL ARTHRITIS
The lower one goes in the hierarchy (modeling tradition, modeling paradigm,
model class, model instance), the more frequently changes must be made during the course of
any given application's life-cycle. Arthritis -- that is, pain associated
with change or movement -- generally becomes worse the higher one goes
in the hierarchy. For example, it is usually easy to change a single number
in a network flow model, a little more difficult to add a new class of
nodes to a network flow model, quite disruptive to switch from the network
flow paradigm to another paradigm like integer linear programming, and
still more disruptive to switch from the operations research modeling tradition
to a completely different tradition.
"Dimensional" arthritis has to do with changing model instances within
a given model class. "Structural" arthritis has to do with changing
model classes within a given modeling paradigm, or even changing paradigms within
a given modeling tradition. How one designs and implements a model can influence
its susceptibility to arthritis, so this is an important design consideration.
The key to avoiding both kinds of arthritis is to use a high-level modeling
language that cleanly separates model structure (the essence of "model
class") from data (the essence of "model instance"). See
the maxim on this.
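The structure/data separation can be made concrete with a minimal Python sketch (a hypothetical illustration, not a real modeling language): the function below embodies a model class, and each data dictionary passed to it yields a model instance, so dimensional change requires no edit to the structure at all.

```python
# A hypothetical sketch: general structure (the model class) is coded
# once; instantiating data alone determines each model instance.

def transportation_instance(data):
    """Build the constraint skeleton of a transportation model class
    from instantiating data alone."""
    constraints = []
    for p in data["plants"]:
        constraints.append(f"sum of flows out of {p} <= {data['supply'][p]}")
    for c in data["customers"]:
        constraints.append(f"sum of flows into {c} >= {data['demand'][c]}")
    return constraints

data = {
    "plants":    ["P1", "P2"],
    "customers": ["C1"],
    "supply":    {"P1": 40, "P2": 60},
    "demand":    {"C1": 75},
}
for line in transportation_instance(data):
    print(line)
```

Adding a plant is a change to `data` only (dimensional change); adding a new class of constraints is a change to the function (structural change); the separation keeps the two from interfering.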
#18) MATCH MODEL REALISM WITH SOLVER CAPABILITY
Although I have argued elsewhere that models and solvers should be strictly
separated, they must be well matched to one another if a modeling application
is to be successful. After all, what is the point of designing a model
if the end result cannot be solved satisfactorily?
Unfortunately, greater model realism usually means less tractability by available
solution technology. This forces the successful modeler to make a conscious
trade-off between model realism and mathematical/computational tractability.
Morris <1967> argues for consciously making
this trade-off in an iterative way.
Simplicity, the subject of another maxim, is a relevant
consideration.
#19) WORK WITHIN ESTABLISHED STANDARDS
The point of this maxim is to advocate the use of standardized modeling
paradigms and representational styles.
Other things being equal, there are always advantages to reducing the number
of different modeling paradigms or representational styles when modeling:
communication is facilitated because fewer paradigms or styles must be
learned or remembered by those who must work with them, and model integration
is facilitated because more of the things to be integrated are expressed
in a common framework.
#20) PREVENT ERRORS BEFORE THEY OCCUR
As the manufacturing strategists say, "Quality should be built in,
not added on". In the context of modeling, this means that an attempt
should be made at the model design stage to anticipate the kinds of errors
likely to occur during the building of a model. Preventative measures should
be taken whenever possible.
One specific suggestion: when specifying a model class of interest,
make it as tight as possible. Then any data leading to a model instance
that falls outside of the specified class should be automatically recognizable
as such (e.g., a parameter specified to be a nonnegative integer rather
than simply a real is more resistant to specification error). Moreover,
a tight model class often enables automatic data entry by the modeling
environment, which of course precludes manual entry errors (e.g., specifying
index sets by formula whenever possible -- as discussed at length in Geoffrion
<1992> -- makes manual entry unnecessary).
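Both parts of this suggestion can be sketched in a few lines of Python (names and the particular checks are hypothetical illustrations, not from any specific modeling system):

```python
# A hypothetical "tight" parameter declaration: a capacity declared to
# be a nonnegative integer, rather than simply a real, makes
# out-of-class data automatically recognizable.

def check_capacity(value):
    """Reject any capacity that falls outside the declared class."""
    if not isinstance(value, int) or value < 0:
        raise ValueError(f"capacity must be a nonnegative integer, got {value!r}")
    return value

# An index set specified by formula rather than entered by hand, which
# precludes manual-entry errors for that set.
def period_index_set(horizon):
    return [f"T{t}" for t in range(1, horizon + 1)]

check_capacity(120)                 # within the class: accepted
print(period_index_set(3))          # prints ['T1', 'T2', 'T3']
try:
    check_capacity(-5)              # outside the class: rejected
except ValueError as e:
    print(e)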
A second suggestion, made by Morris <1967>
among others, is to look at very simple numerical instances at the model
design stage. This helps to reveal bugs before a full-scale model has been
built. Morris also notes other advantages of this practice. Not only should
the modeler do the looking, but some of the users as well (in keeping with
the maxim to involve the user at all stages).
Many of the other maxims will reduce the likelihood of model-building errors
if followed. For example, the beneficial effects of simplicity,
mnemonic notation, good documentation,
and working within standards are obvious.
See Bisschop <1987> for a paper-length
treatment of this topic.
Bisschop, J. <1987>. "Language Requirements
for A Priori Error Checking and Model Reduction in Large-Scale Programming,"
Proceedings of the NATO Advanced Study Institute on Mathematical Models for
Decision Support (July 27 - August 6, 1987 in Val d'Isere, France).
Bürger, W.F. <1982>. "MLD: A Language
and Data Base for Modeling," Research Report RC 9639, IBM T.J. Watson Research
Center, Yorktown Heights, September 14.
Cox, B.J., Jr. <1986>. Object-Oriented Programming,
Addison-Wesley, Reading, MA.
Dahl, O.J., E.W. Dijkstra, and C.A.R. Hoare <1972>.
Structured Programming, Academic Press, London.
Date, C.J. <1982>. An Introduction to Database
Systems, Addison-Wesley, Reading, MA.
Davis, R. <1986>. "Knowledge-Based Systems,"
Science, 231 (February 28), pp. 957-963.
Fourer, R. <1983>. "Modeling Languages Versus
Matrix Generators for Linear Programming," ACM Transactions on Mathematical
Software, 9:2 (June), pp. 143-183.
Gass, S.I. <1984>. "Documenting a Computer-Based
Model," Interfaces, 14:3 (May-June), pp. 84-93.
Geoffrion, A.M. <1976>. "The Purpose
of Mathematical Programming is Insight, Not Numbers," Interfaces,
7:1 (November), pp. 81-92.
Geoffrion, A.M. <1987> "An Introduction
to Structured Modeling," Management Science, 33:5 (May),
pp. 547-588.
Geoffrion, A.M. <1989>. "Reusing Structured
Models via Model Integration," Proceedings of the Twenty-Second Annual
Hawaii International Conference on System Science, Kailua-Kona, Hawaii,
January 3-6, IEEE Computer Society Press. Reprinted in Current Research in
Decision Support Technology, R.W. Blanning and D.R. King (Eds.), IEEE Computer
Society Press, 1992.
Geoffrion, A.M. <1992>. "Indexing in
Modeling Languages for Mathematical Programming," Management Science,
38:3 (March 1992).
Hammond, J. <1974>. "Do's and Don'ts of
Computer Models for Planning," Harvard Business Review, 52:2
(March-April), pp. 110-123.
Hogan, W.W. and J.P. Weyant <1983>. "Methods
and Algorithms for Energy Model Composition: Optimization in a Network of Process
Models," in B. Lev (ed.), Energy Models and Studies, North-Holland,
Amsterdam.
Hong <1986> "Guest Editor's Introduction,"
Computer, 19:7 (July), pp. 12-15.
Miser, H.J. <1989>. "The Craft of Operations
Research," Operations Research, 37:4 (July-August), pp. 669-672.
Miser, H.J. and E.S. Quade <1985>. Handbook
of Systems Analysis, North-Holland, New York.
Morris, W. <1967>. "On the Art of Modeling,"
Management Science, 13:12 (August), pp. B707-B717.
Peters, T.J. and R.H. Waterman, Jr. <1982>. In
Search of Excellence, Harper & Row, New York.
Sen, T. <1992>. "Diagrammatic Knowledge Representation,"
IEEE Transactions on Systems, Man and Cybernetics, 22:4,
pp. 275-291.
Westerberg, A.W. <1985>. "Aids for
Engineering System Model Formulation," Working Paper, Dept. of Chemical
Engineering, University of Wisconsin, Madison, presented as the second of three
Hougen Lectures, February 27.
Wirth, N. <1971>. "Program Development by
Stepwise Refinement," Communications of the ACM, 14:4 (April),
pp. 221-227.