September 10, 1989
Minor Revision April, 1997
UCLA Anderson School of Management
There are certain beliefs that underlie how I try to behave when I do applied modeling work. This note endeavors to make some of these beliefs explicit in the form of "maxims". That most of the maxims are conventional wisdom does not diminish their importance to successful professional practice (the same was true of the exhortations in Peters and Waterman <1982>).
My purpose in writing on this subject is twofold: (1) I need a relatively compact checklist to remind myself what I should be doing when I work on applications; and (2) I want to share with my students and selected colleagues some of what I have learned about applied work. Consequently, this is not a "research" paper and is far from being suitable for publication.
I. Personal Conduct and Project Management
II. Conceptualizing and Designing Models
Each maxim bears a number for the sake of easy reference. Some maxims within these groups comprise multiple submaxims, numbered accordingly. Some discussion aimed at justification and identifying further reading is given for each maxim. The complete list is as follows:
#1) Reserve the Right of Problem Formulation
#2) Develop a Clear Charter and Project Plan
#3) Find a High Level Champion
#4) Establish Personal Credibility Early
#5) Produce Results Early
#6) Involve the Sponsor and Future Users at All Stages
#7) Communicate Often and Well
#8) Keep It Simple
#9) Be Creative About Alternatives
#10) Separate General Structure and Instantiating Data
#11) Separate Model, Problem/Task, and Solver
#12) Model the Data Development Phase
#13) Know Your Data
#14) Compose From Submodels
#15) Choose Notation Carefully
#15a) Minimize the Need for Manual Language Translation
#15b) Use Mnemonic Notation
#16) Maximize Understandability
#16a) Exploit Parallel Structure
#16b) Use Modules
#16c) Organize Things Hierarchically
#16d) Document Thoroughly
#16e) Avoid Forward References
#16f) Hide Inessential Detail
#16g) Draw Pictures
#17) Plan for Change
#17a) Document Interdependencies
#17b) Use Stepwise Refinement
#17c) Avoid Structural and Dimensional Arthritis
#18) Match Model Realism with Solver Capability
#19) Work Within Established Standards
#20) Prevent Errors Before They Occur
Notice that I have concentrated on leadership matters and on the early stages
of a modeling project. Much more could be said about the more technical aspects
of building, maintaining, and using models for their intended purposes. I do
not claim that the maxims given are in any sense complete. They are not, nor
could any list ever be.
I believe that these maxims will contribute greatly to the success of modeling professionals if followed faithfully. Many are part of the common wisdom and will strike most readers as obvious. Others may be less obvious and perhaps even seldom observed. All are the result of experience and reflection. However, I readily confess that I have shamelessly stolen many of these maxims from others, and also that I lack the fortitude to observe all of them all of the time.
I sincerely invite readers to take issue with the views presented here, to offer new supporting arguments and counter arguments, and to suggest additional or alternative maxims together with their rationale.
The maxims in this section concern how one conducts oneself relative
to others during the course of modeling work, and how one manages a modeling
project. They may seem to have little to do with technical expertise, but
they could not be followed effectively without excellent technical capabilities.
#1) RESERVE THE RIGHT OF PROBLEM FORMULATION
It is a myth to think that the client of a modeling study hands the problem formulation down from on high. Most executives and decision-makers in need of modeling help do quite a poor job of that, so the leaders of a modeling effort should endeavor to keep early interaction with the client to general objectives and background.
Section 10.4 of Miser and Quade <1985> contains an insightful discussion of this point:
One might conclude that the analyst should try to get the manager to sharpen his problem statement. However, experience tells us overwhelmingly that this is the opposite of what is desirable at the beginning: the analyst is well advised to keep the manager's appreciation of his problem as broad and general as possible, so that the analyst making early inquiries into the situation is free to formulate the problem (if indeed this is possible) without the inhibiting constraint of an authoritative misperception. In fact, in my experience, perhaps the worst thing that can happen is for the executive to write a memorandum stating what the problem is, particularly if he is a very strong and dominating personality; his statement then becomes a major deterrent to developing the realistic problem appreciation needed for good analysis, and makes it doubly hard to get this appreciation accepted. The moral is plain: At the beginning, keep the discussions and interactions as broad and flexible as possible, to the end that the early fact finding and analysis can dominate how the problem is formulated.
#2) DEVELOP A CLEAR CHARTER AND PROJECT PLAN
Every model-based project needs a clear statement of objectives and how the project will go about achieving them. Moreover, this statement must be mutually acceptable to management and to the project leaders. It serves as both charter and compass for the voyage.
How can anyone hit a target if they don't know what it is? Moreover, since there is often more than one person exercising independent initiative, the crossfire can be dangerous when different people have different targets. This is known as "working at cross-purposes".
Miser and Quade <1985> has a good discussion in Section 10.5. See also Hammond <1974>.
One of the maxims in Morris <1967> is: Establish a clear statement of the deductive objectives.
#3) FIND A HIGH LEVEL CHAMPION
Many of the successful projects I know of succeeded because of the faithful patronage of a senior manager. Such patronage is necessary to weather the many storms -- budgetary, organizational, and political -- to which most significant projects are subject, especially those that cannot be accomplished within a short time (a couple of months, say).
Hammond <1974> comments on the importance of a champion (p. 114).
#4) ESTABLISH PERSONAL CREDIBILITY EARLY
Credibility means that your peers and superiors recognize that you are competent, have integrity, and are able to make realistic estimates of the time and resources required to accomplish something. In other words, you are trusted and respected.
Your effectiveness in any significant role, whether of a leadership or subordinate nature, is seriously and often fatally undermined without the ability to establish trust and respect.
#5) PRODUCE RESULTS EARLY
Most sponsors of modeling work have a short-term outlook. They need to see results. Modeling professionals must understand this fact of life, and give priority to getting at least preliminary results as soon as possible.
See page 119 of Hammond <1974>.
This maxim may argue for the use of rapid prototyping techniques.
#6) INVOLVE THE SPONSOR AND FUTURE USERS AT ALL STAGES
Almost every article that reports on successful modeling work or offers advice for practitioners mentions the importance of heavily involving representatives of the sponsor and users at all stages (e.g., Hammond <1974>). The reason has to do partly with psychology, partly with the genuine need for such input, and partly with the proper role of modeling as a complement and supplement to reasoned judgement rather than as a replacement for it.
One way to approach this maxim is to ask yourself, very early in a project, how you can get the sponsors and users to "own" the envisioned system and have most of its features be "their" idea. If you do this, you will be forced to involve sponsors and users and to rein in the rampant technocentrism that might otherwise undermine even projects of outstanding technical excellence.
The benefits of having at least one member of the client organization participate throughout are explained on page 296 of Miser and Quade <1985>.
#7) COMMUNICATE OFTEN AND WELL
It would be difficult to overestimate the value of effective communication between the project team and the sponsor and users, at all stages of the project.
Speak in the language of the sponsors and users. Relate all communications to their needs; applied model-based work is done to help them, not you.
Speak in terms of WHAT and WHY, not HOW. For example, speak in terms of inputs and outputs in preference to the technology of what happens in between. But -- and this is very important -- be prepared to help sponsors and users gain a good understanding of why the modeling results are what they are (one of my sermons on this, and a possible approach, is the subject of Geoffrion <1976>).
Sections 10.11 and 10.13 of Miser and Quade <1985> contain a cogent discussion of the importance of communication and how to do it effectively.
How one conceptualizes that which is to be modeled, and then designs
a model, sets the stage for actually building and using a model. A good
conceptualization and design does not guarantee success later on, but a
bad conceptualization and design all but guarantees failure.
Most of the maxims given in what follows should be kept in mind not only during the conceptualization and design phase of modeling work, but also during the phases in which models are built, used, and maintained.
#8) KEEP IT SIMPLE
A useful conceptualization and model design ignores inessential details about the thing being modeled. Excessive detail obscures what is really important. Moreover, complexity causes technical difficulties such as inefficiency and unmaintainability. It can easily cause a kind of paralysis that scuttles all chances of success.
Section 10.7 of Miser and Quade <1985> gives a cogent quote from H. Raiffa urging simplicity. See also page 117 of Hammond <1974>.
Here is a nice quote from C.S. Holling appearing in Miser <1989>:
Any model is a caricature of reality. A caricature achieves its effectiveness by leaving out all but the essential; the model achieves its utility by ignoring irrelevant detail. There is always some level of detail that an effective model will not seek to predict, just as there are aspects of realism that no forceful caricature would attempt to depict. Selective focus on the essentials is the key to good modeling.
Excessive simplicity is equally devastating, of course, but the kinds
of people who undertake model-based work seem genetically inclined far
more toward excessive complexity than toward excessive simplicity.
#9) BE CREATIVE ABOUT ALTERNATIVES
Even the most sophisticated optimizers and other model manipulation technologies that will be available a millennium from now will not be able to arrive at creative solutions and recommendations if the alternatives with which they must work are mundane.
There is nothing special about the training of OR/MS professionals which enhances their ability to be creative in devising alternatives for analysis. This trait must be cultivated throughout one's professional life.
See Section 10.7 of Miser and Quade <1985> for a good discussion of this point. They point out that truly creative alternatives sometimes get analysts into trouble with their clients, and that is one of the professional risks with which modeling professionals must live. I would add that, if you aren't ruffling a few feathers now and then, you probably aren't being creative enough in coming up with imaginative alternatives.
#10) SEPARATE GENERAL STRUCTURE AND INSTANTIATING DATA
General structure refers to a class of model instances that captures virtually all of the instances of possible interest, and that is convenient to work with. Each model instance can be represented as a particular general structure together with particular instantiating data.
Since general structure is dimension-independent and data-independent, it is less volatile and more compact than specific model instances. This implies that it is well-suited to such uses as mathematical analysis, auditing, verification, reuse, and being relatively easily understood and communicated. The main ideas in this paragraph and the preceding one are discussed in more detail in Geoffrion <1992>. See also "desired feature (e)" in Section 1.2 of Geoffrion <1987>.
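As a minimal sketch of this separation, consider the Python fragment below. Everything in it is a hypothetical illustration rather than anything taken from the cited papers: the TransportData class, the total_cost function, and the plant and customer codes are all assumed names. The general structure (a cost formula over index sets) is written once, and two different sets of instantiating data are then bound to it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransportData:
    """Instantiating data only: index sets and parameter values, no formulas."""
    plants: tuple
    customers: tuple
    unit_cost: dict  # (plant, customer) -> cost per unit shipped


def total_cost(data, flow):
    """General structure: sum over all plant-customer links of unit cost times flow.

    Nothing here depends on how many plants or customers there are,
    or on any particular numeric value.
    """
    return sum(
        data.unit_cost[p, c] * flow.get((p, c), 0.0)
        for p in data.plants
        for c in data.customers
    )


# Two hypothetical instances bound to the same general structure.
one_plant = TransportData(
    plants=("DAL",),
    customers=("PIT", "ATL"),
    unit_cost={("DAL", "PIT"): 3.0, ("DAL", "ATL"): 2.0},
)
two_plants = TransportData(
    plants=("DAL", "CHI"),
    customers=("PIT",),
    unit_cost={("DAL", "PIT"): 3.0, ("CHI", "PIT"): 1.5},
)

cost_a = total_cost(one_plant, {("DAL", "PIT"): 10, ("DAL", "ATL"): 5})
cost_b = total_cost(two_plants, {("CHI", "PIT"): 4})
```

Changing the number of plants or any cost figure touches only the data objects; the formula in total_cost can be audited and reused without reference to either instance.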
The idea that reusability is an important property of general structure is the subject of another paper of mine that also recognizes the relevance of analogous ideas in software engineering (Geoffrion <1989>). A closely related idea is modeling by analogy. For example, Morris <1967> asserts that analogy or association with previously well developed logical structures plays an important role in modeling. This is really the "reuse" idea. Later he states the maxim: Seek analogies.
If thinking involves abstraction and abstraction is a major aim of general structure, then general structure derives some of its importance from the importance of thinking.
If general structure and instantiating data can be separated sufficiently, then it may be possible to use a modern database system to manage the data. This point has been recognized in recent years by a number of authors including Bürger <1982>, who presents a design for an LP system, MLD, that is one of the best-integrated in terms of an algebraic modeling language together with a supporting relational database system for data manipulation. Here is a pertinent quote:
In MLD, model solutions are obtained by binding a data module to a model module and executing it. This mechanism makes it easy to use the same data with different models, or to solve a model with different data. The binding between a class of models and a class of data modules is defined by a so-called execution module. The concept of an execution module has some interesting implications for an applied environment: model and execution module can be prepared by an 'expert', while data preparation, model execution, and analysis of the model results then can be carried out by a 'client'.
To put one of Bürger's points a bit differently, one could say
that general structure and instantiating data are developed and used for
different purposes, often by people with vastly different backgrounds;
to confound them is to inhibit efficient divisions of labor and to promote
Part of the concept of separating structure and data is the idea that the latter should be represented as a minimally redundant extension of the former. The reason is simply that it is inefficient to repeat general structure as part of the representation of instantiating data. Moreover, redundancy opens the door to inconsistency.
Separating logic and data preserves the integrity of models and datafiles and simplifies application development, maintenance and consolidation processes. (1986 sales brochure for IFPS/PERSONAL).
An analogous distinction (database schema vs. actual data) has been important in the field of database management since the early 1970's, and has been argued persuasively in that context (e.g., Date <1982>).
An analogous distinction also is important in the field of computer programming. The guest editor of a special issue of Computer says (Hong <1986>): In early programs, data and code were often intermixed. Modern-day programs generally have clearer separation between data and program constructs. This same point is a strong theme in Davis <1986>.
#11) SEPARATE MODEL, PROBLEM/TASK, AND SOLVER
I discussed these distinctions in Section 2.3 (see also desirable feature (b) of Section 1.2) of Geoffrion <1987>. Basically: (a) a model is a representation of some aspects of reality, (b) a problem or task describes something to be done with a model, and (c) a solver is a manipulator of a model according to some definite procedure for solving a problem or performing a task.
Perhaps the most obvious reason for the importance of the model/solver distinction is that it encourages the same model to be used with different solvers (perhaps to solve different problems or to carry out different tasks), and it encourages the same solver to be used with different models. Such reuse can save a lot of time and resources.
Another obvious reason for distinguishing between a model, a problem or task, and a solver is the pursuit of conceptual clarity. Non-specialists can easily become confused otherwise. For example, every consultant has had a client ask "Can you handle such-and-such a feature?". The true answer is often Yes and No: "Yes" in that you could include the feature in the model, but "No" in that an otherwise excellent solver would become inapplicable or inefficient. This answer will not be understood unless the client knows the conceptual difference between a model and a solver, and knows that a great variety of problems and tasks lies between the two. For another example, a user who is presented with an "LP model" may fail to understand that its database can be used for ad hoc retrieval, that the objective function and a constraint can switch roles in order to drive on a different criterion, or that the model can be used for static simulation on a casewise basis. It would be better for the user to be presented with a "model" on which many problems and tasks may be posed, some of which may include optimization performed with the help of optimizing solvers.
A more subtle reason for distinguishing between model and solver is that not doing so inevitably leads to predicating model design on a particular solver's limitations. This inhibits keeping track of modeling decisions that might warrant reversal if and when a more capable solver becomes available.
This maxim runs contrary to much common practice. For example, practitioners often state an optimization problem simultaneously with the model itself (e.g., "Maximize f(x) over x subject to g(x) = b, where x represents..."). It is a simple matter to describe the model itself first, and then to pose an optimization problem on the model.
... the equations used in stating the model should be kept separate from the method devised to solve them. (page 8 of Westerberg <1985>).
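The three-way distinction can be sketched in Python as follows. This is only an illustration under stated assumptions: the MODEL dictionary, the task format, and the solve_by_enumeration function are hypothetical names, and brute-force enumeration stands in for whatever solver would really be used. The model is described first and by itself; two tasks are then posed on it, the second switching the roles of objective and constraint.

```python
import operator
from itertools import product

# Model: a representation only (a finite decision domain plus named functions).
# It says nothing about what is to be optimized. All names are hypothetical.
MODEL = {
    "domain": {"x": range(0, 6), "y": range(0, 6)},
    "functions": {
        "profit": lambda v: 3 * v["x"] + 2 * v["y"],
        "labor": lambda v: 2 * v["x"] + v["y"],
    },
}

# Tasks: things to be done with the model.
MAX_PROFIT = {"optimize": "profit", "sense": "max", "constraint": ("labor", "<=", 8)}
MIN_LABOR = {"optimize": "labor", "sense": "min", "constraint": ("profit", ">=", 12)}

OPS = {"<=": operator.le, ">=": operator.ge}


def solve_by_enumeration(model, task):
    """Solver: try every point of the domain; knows nothing about any one model."""
    names = list(model["domain"])
    funcs = model["functions"]
    con_name, con_op, con_bound = task["constraint"]
    sign = 1 if task["sense"] == "max" else -1
    best_value, best_point = None, None
    for combo in product(*(model["domain"][n] for n in names)):
        point = dict(zip(names, combo))
        if not OPS[con_op](funcs[con_name](point), con_bound):
            continue  # point violates the task's constraint
        value = funcs[task["optimize"]](point)
        if best_value is None or sign * value > sign * best_value:
            best_value, best_point = value, point
    return best_value, best_point


profit_value, profit_point = solve_by_enumeration(MODEL, MAX_PROFIT)
labor_value, labor_point = solve_by_enumeration(MODEL, MIN_LABOR)
```

Because the solver receives the model and the task as separate arguments, a different solver could be substituted, or a new task posed, without editing the model.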
Analogous distinctions are important in the neighboring field of knowledge-based systems, where the knowledge base is analogous to the model and the inference engine is analogous to the solver. Speaking about such systems for the lay reader, Davis <1986> said
One of the distinguishing characteristics of these systems is the sharp distinction between the inference engine and the knowledge base. This division has two interesting consequences. First, it makes possible the substitution of a new knowledge base for a new task in place of the existing knowledge base, producing a new system as a result ... [using] the same inference engine. ... Second, it encourages taking an additional step along what has been called the 'how to what' spectrum ... encouraging the construction of a knowledge base containing what the program should know, rather than what it should do. ... The most important consequence of this distinction is that it enables the system to make multiple different uses of the same knowledge, facilitating explanation, knowledge acquisition, and tutoring. The separation of inference engine and knowledge base, and resulting multiple use of the same knowledge, is thus at the root of an expanded view suggesting that a program may do considerably more than compute an answer.
#12) MODEL THE DATA DEVELOPMENT PHASE
I sometimes make essentially this point in my classes by recommending that "upstream" calculations should be made fully explicit in spreadsheets and other modeling media.
From the point of view of the solver, especially if the solver is an optimizer, most of the data processing and computations done as part of the data development phase can be ignored once the final distilled model is made ready for the solver. The solver has no need to know of such preliminaries. Unfortunately, these preliminaries actually are ignored all too often in the rush of real applications.
The point of this maxim is to recommend that this preparatory work be formalized to an appropriate degree, rather than being viewed as an annoying but necessary evil -- grubby work that can be forgotten as soon as it is done. The reason is simple: the truly essential nature of this much maligned work should earn it just as much attention, understanding, status, and documentation as are accorded the more glamorous analytical work that it makes possible. Thus it is necessary to formally incorporate data development logic into "the model".
Inadequate documentation of the data development process is a chronic complaint of modeling professionals. Modeling data development helps to assure its documentation, at least in a technical sense.
My intention here is to argue not only that data development should be modeled, but also that it should be modeled in the same language (or by the same formalism) used for the main part of the model at hand. For example, avoid representing a linear programming model in an algebraic modeling language at the same time as its source data are represented in a spreadsheet. Pick one formalism or the other (or one that is flexible enough to handle both parts with ease). The reason is that using two formalisms rather than one adds to both real and apparent complexity, and contributes to the Tower of Babel effect that plagues the modeling community.
If model specifications are executable, then modeling data development can automate some of the work that would have to be done in some other way (this is especially important when data development steps need to be executed multiple times).
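A minimal sketch of what modeling the data development phase might look like, assuming hypothetical raw freight records and hypothetical function names (develop_unit_costs, freight_cost). The upstream calculation that turns raw records into a model parameter is written explicitly, in the same formalism as the model that consumes it, so that it is documented and can be re-executed whenever the raw data change.

```python
# Raw source data as gathered (hypothetical records).
RAW_SHIPMENTS = [
    {"lane": "DAL-PIT", "freight_paid": 300.0, "units": 100},
    {"lane": "DAL-PIT", "freight_paid": 500.0, "units": 100},
    {"lane": "DAL-ATL", "freight_paid": 150.0, "units": 75},
]


def develop_unit_costs(shipments):
    """Data development step: unit cost per lane = total freight paid / total units."""
    paid, units = {}, {}
    for row in shipments:
        lane = row["lane"]
        paid[lane] = paid.get(lane, 0.0) + row["freight_paid"]
        units[lane] = units.get(lane, 0) + row["units"]
    return {lane: paid[lane] / units[lane] for lane in paid}


def freight_cost(unit_cost, planned_units):
    """Main model: planned freight cost, using the developed parameter."""
    return sum(unit_cost[lane] * q for lane, q in planned_units.items())


unit_cost = develop_unit_costs(RAW_SHIPMENTS)
planned_cost = freight_cost(unit_cost, {"DAL-PIT": 10, "DAL-ATL": 5})
```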
#13) KNOW YOUR DATA
Every statistician knows that knowing the data well is essential in order to draw any meaningful conclusions. But, surprisingly, many people trained in OR/MS don't know this, at least until they have participated in a few projects. The reason may be that OR/MS people tend to be oriented toward mathematical manipulation of models, which nearly always assumes that the data are credible. Of course, they may not be.
Miser and Quade <1985> make this point forcibly:
... the central lesson of my experience: that what one knows about the supporting evidence will play a very large role in how the findings of the analysis are interpreted. This point argues against using data already gathered unless absolutely necessary, and certainly against using them without knowing how they were gathered and -- equally important -- how they were processed. In many cases systems analysts cannot avoid using data gathered elsewhere for other purposes ... but considerable effort should be devoted to learning how these data were developed, and what their strengths and weaknesses are, so that the findings of the analysis can take account of such knowledge. Perhaps one of the most important pitfalls of analysis is to put more credence in data than is warranted by the way they were developed.
#14) COMPOSE FROM SUBMODELS
Some people believe that composition (assembly) from tested submodels is the only way to successfully build a complex model. See, for example, Hogan and Weyant <1983>. See also pages 6 and 8 of Westerberg <1985>: ... to create really large models, one must assume that most of the model will already exist in the form of submodels stored within a library. ... The model should be created in parts, each of which can be tested. Once tested, the parts can become part of a larger model with some assurance that at least the parts are well formulated. ... Thus any language MUST support the development of models built up of previously created models.
Ideally, one wants to compose models from submodels in a formal rather than ad hoc way, within a unifying framework. Probably this requires that the component models all be expressed in a common formalism; if not originally, then in the course of composition. A modest literature is growing up around this approach, which usually is known as "model integration". One of the goals of this literature is to automate integration.
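In miniature, and only as an assumed illustration (the revenue, production_cost, and profit functions are hypothetical), composition from tested submodels in a common formalism might look like this in Python: each part is checked on its own before it is assembled into the larger model.

```python
def revenue(price, units):
    """Submodel 1: sales revenue."""
    return price * units


def production_cost(fixed_cost, unit_cost, units):
    """Submodel 2: fixed plus variable production cost."""
    return fixed_cost + unit_cost * units


# Each submodel is tested by itself before composition.
assert revenue(10.0, 3) == 30.0
assert production_cost(5.0, 2.0, 3) == 11.0


def profit(price, fixed_cost, unit_cost, units):
    """Composite model, assembled from the two tested submodels."""
    return revenue(price, units) - production_cost(fixed_cost, unit_cost, units)


composite_result = profit(10.0, 5.0, 2.0, 3)
```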
Analogous ideas often are discussed in software engineering. See, for example, the discussion in Cox <1986> on "Software ICs".
#15) CHOOSE NOTATION CAREFULLY
Notation can be immensely powerful if well designed. It can keep the user out of trouble, focus attention on what is important, be compact, and almost have an intelligence of its own. Careful attention to notation and representational devices in general is one of the hallmarks of good modeling.
#15a) MINIMIZE THE NEED FOR MANUAL LANGUAGE TRANSLATION
The three most common language types for representing models are plain English, mathematics, and computerese. All three are used in a typical model-based project, and other types of languages are sometimes used as well. Each serves a useful purpose, but such multiple languages do have disadvantages, including:
a) Some people who play vital roles in model-based work will resist spending
the time needed to learn a new language that they don't already know, especially
if already well established in their careers. This makes communicating with
such people difficult.
b) It can take a lot of skilled time to translate back and forth among languages.
c) Multiple languages are redundant, which makes them susceptible to all of the diseases of redundancy, including inefficiency and the possibility of inconsistency when changes are necessary. See Fourer <1983> for an excellent discussion of such disadvantages in the context of LP optimization.
The disadvantages of multiple languages are mitigated if translations among
them can be accomplished automatically. For example, an executable modeling
language permits automatic translation into computerese. It may also permit
automatic translation into stylized but plain English model documentation, and
into mathematical notation of a relatively traditional sort.
#15b) USE MNEMONIC NOTATION
Choosing mnemonic names for the objects that arise in modeling work -- especially mathematical objects -- facilitates communication with others and enhances maintainability.
A good mnemonic name is no longer than necessary to be unique and remind the reader what it stands for.
#16) MAXIMIZE UNDERSTANDABILITY
It is important for models to be designed and documented in such a way as to facilitate demonstrating, to builders as well as sponsors and users, that they are "correct" in concept. This is almost the same as designing and documenting models in such a way as to make them understandable to all concerned.
The maxims that fall under this rubric contribute to understandability.
#16a) EXPLOIT PARALLEL STRUCTURE
When several things have a lot in common, understanding any one of them can be nearly the same as understanding them all. This makes it advantageous to treat them as a group, notationally as well as conceptually.
Indexing structures are the main means by which parallel structure is exploited notationally. This point is discussed in Section 1 of Geoffrion <1992>.
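To illustrate with a small Python sketch (the plant codes, figures, and the capacity_violations function are hypothetical): rather than writing one separately named capacity check per plant, a single rule indexed over the set of plants covers them all, and it remains unchanged when plants are added or removed.

```python
PLANTS = ("DAL", "CHI", "NYC")
CAPACITY = {"DAL": 100, "CHI": 80, "NYC": 60}
PRODUCTION = {"DAL": 90, "CHI": 85, "NYC": 60}


def capacity_violations(plants, capacity, production):
    """One indexed rule: production[p] <= capacity[p] for every plant p.

    Returns the plants at which the rule is violated.
    """
    return [p for p in plants if production[p] > capacity[p]]


violations = capacity_violations(PLANTS, CAPACITY, PRODUCTION)
```

Understanding the rule for one plant is the same as understanding it for all of them, which is the point of treating the group as a unit.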
#16b) USE MODULES
A good way to deal with something complex is to break it down into its important parts or modules. If wisely defined, these modules are more easily understood and manipulated than the whole which comprises them. Thus models should be designed not as monoliths, but rather as interconnected collections of modules.
But don't take the modularization process too far, or you will arrive at an atomic view of the whole that is equally unmanageable. The atomic view obscures the forest for the trees. Important connections and similarities are not clear. A model built as a mosaic out of tiny pieces is boring and incomprehensible to all but the trained intellect able to catch the Gestalt of it. (Gestalt has been defined as "a structure or configuration ... so integrated as to constitute a functional unit with properties not derivable from its parts in summation.") Well modularized models don't depend on Gestalt; the properties of the whole are evident from the sum of the interconnected modules.
This maxim is related to the Compose From Submodels maxim (#14).
Modularization has come to assume a large role in software engineering, and some of the reasons for this are equally applicable in the context of modeling. This point is discussed in Section 1 of Geoffrion <1989>.
One of the maxims in Morris <1967> is: Factor the system problem into simpler problems. This is essentially the same as the concept of modularization. Later he says: The real objective of systems analysis is not simply to study larger and larger problems, but to find ways of 'cutting' large problems into small ones, such that the solutions of the small ones can be combined in some easy way to yield solutions for the large ones.
#16c) ORGANIZE THINGS HIERARCHICALLY
Hierarchical organization is widely practiced and recognized in many fields as an effective way to deal with complexity (which, in turn, is the bane of understanding and communication). Hierarchical organization should be applied to the conceptual units of any model and its documentation. That is, use the classical outline form.
For an example of hierarchical organization, one need look no further than the way Unix or Windows files are organized through directories and subdirectories.
#16d) DOCUMENT THOROUGHLY
Inadequate documentation has been identified again and again as a major factor contributing to failures of modeling projects and the premature demise of modeling systems. It impedes understanding, communication, maintainability, and necessary evolution.
See Gass <1984> and Section 10.9 of Miser and Quade <1985>.
A readable executable modeling language solves the "documentation problem" by dissolving the distinction between a model and its documentation. The model becomes its own documentation, and the documentation becomes the model.
#16e) AVOID FORWARD REFERENCES
Avoiding forward references when defining a model provides a measure of protection from circular definitions.
A document with no forward references can be read and understood in a single pass. There is no need to suffer the inefficiency and distraction of having to jump ahead into unfamiliar territory in order to obtain needed definitions.
Avoiding forward references tends to increase the sequentiality of a model's description, and consequently its "simplicity". The virtues of such simplicity have been argued in the related context of structured programming (e.g., Dahl, Dijkstra, and Hoare <1972>).
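The protection against circular definitions can be made concrete with a short Python sketch. The representation is an assumption made for illustration: a model description is taken to be an ordered list of (name, referenced names) pairs, and forward_references is a hypothetical helper. If every definition refers only to names defined earlier, no cycle is possible; any circular pair must show up as at least one forward reference.

```python
def forward_references(definitions):
    """Return (name, ref) pairs where ref is used before it has been defined.

    definitions: list of (name, list_of_referenced_names) in document order.
    A self-reference, or a reference to a name that is never defined,
    is also reported, since neither has been defined at the point of use.
    """
    defined = set()
    problems = []
    for name, refs in definitions:
        for ref in refs:
            if ref not in defined:
                problems.append((name, ref))
        defined.add(name)
    return problems


# Readable in a single pass: nothing refers ahead.
ORDERED = [("price", []), ("units", []), ("revenue", ["price", "units"])]

# A circular pair: whichever comes first must refer ahead.
CIRCULAR = [("a", ["b"]), ("b", ["a"])]

ordered_problems = forward_references(ORDERED)
circular_problems = forward_references(CIRCULAR)
```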
#16f) HIDE INESSENTIAL DETAIL
One of the basic principles of good communication is to avoid confusing the receiving party with irrelevant detail. Models often contain a great deal of detail, most of which is irrelevant in the context of any particular discussion or explanation; thus provision should be made to hide such irrelevant detail so long as it can be unhidden when necessary.
Tree-oriented editors (outliners) have the ability to hide and unhide subtrees. This is an excellent means for hiding inessential detail when a model's description is organized hierarchically (per another maxim).
The ability to hide detail selectively gives the ability to construct different "views" of a model appropriate to different audiences.
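A rough Python sketch of hiding and unhiding subtrees, assuming a hypothetical outline stored as nested (title, children) pairs and a hypothetical render function: the same hierarchical description yields different views depending on how deep the reader wishes to look, and a "[+]" marker shows where detail has been hidden.

```python
# Hypothetical hierarchical model description: (title, list of child nodes).
OUTLINE = ("Transportation model", [
    ("Plants", [("Supply capacity", [])]),
    ("Customers", [("Demand", [])]),
    ("Links", [("Unit cost", []), ("Flow", [])]),
])


def render(node, max_depth, depth=0):
    """Return the outline as indented lines, hiding subtrees below max_depth."""
    title, children = node
    lines = ["  " * depth + title]
    if depth < max_depth:
        for child in children:
            lines.extend(render(child, max_depth, depth + 1))
    elif children:
        lines[-1] += " [+]"  # marks a hidden subtree that can be unhidden
    return lines


executive_view = render(OUTLINE, 1)
full_view = render(OUTLINE, 2)
```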
#16g) DRAW PICTURES
Graphical displays facilitate visualizing important aspects of a model.
Advocates of modeling paradigms that have natural graphical displays, such as decision analysis, graph theory, and network flow, argue strongly that such displays greatly enhance user understanding.
Some mathematical constructs, such as trees and directed graphs, lend themselves to graphic displays. To the extent that such displays can be understood by people without mathematical training, the difficulties of mathematical modeling are diminished.
Sen <1992> gives additional arguments in favor of pictures. A particularly interesting one is that a well drawn graph has associative relationships embedded in the adjacency of its features, and that this adjacency is not only apparent to viewers, but also can impart an inherent efficiency advantage to algorithms that process the graph.
#17) PLAN FOR CHANGE
Every applied model must accommodate change. The reasons for change include progression through the various stages of specification completeness as data are developed and revised, the evolution of general structure as the modeler gains experience and as requirements shift, and maintenance as unsuspected deficiencies are discovered or the modeling environment changes. Obviously, the flexibility to accommodate such changes is essential to the viability of the project.
It warrants emphasis that, as noted in the previous paragraph, change is not something that happens only after a model has been designed and built. It is a way of life while building a model, too. As Morris <1967> has stressed, the effective modeler constantly alternates between modification of the model and confrontation by the data.
The issue raised here is recognized as a major problem in software engineering, where maintenance costs often dwarf original development costs. Some of the remedies proposed in that context are applicable also to modeling.
#17a) DOCUMENT INTERDEPENDENCIES
Making changes in a model design raises the possibility of introducing inconsistencies. Most inconsistencies arise because the ramifications -- "interdependencies" -- of what was changed were not understood well enough. Therefore, it is desirable for the interdependencies among the various components of a model to be visible and explicit in the model documentation or the model itself. Then when something changes it will be possible to work out what the change could possibly affect (directly or indirectly).
Explicitly documented interdependencies make it easy to determine exactly what parts of a model participate in the support or definition of any particular model component.
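One way to make this concrete is to record, for each component, the components it refers to directly, and then trace those references in either direction. The following Python sketch is an assumption-laden illustration: the component names and the functions `affected_by` and `supports` are hypothetical, chosen only to show the idea.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical components; each key lists what it depends on directly.
depends_on: Dict[str, List[str]] = {
    "total_cost": ["flow", "unit_cost"],
    "flow": ["supply", "demand"],
    "unit_cost": ["distance"],
    "supply": [],
    "demand": [],
    "distance": [],
}


def affected_by(changed: str, deps: Dict[str, List[str]]) -> Set[str]:
    """Components that a change to `changed` could affect, directly or indirectly."""
    # Invert the documented references: who refers to each component?
    dependents: Dict[str, List[str]] = {}
    for name, needs in deps.items():
        for needed in needs:
            dependents.setdefault(needed, []).append(name)
    seen: Set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


def supports(component: str, deps: Dict[str, List[str]]) -> Set[str]:
    """Components that participate in the support or definition of `component`."""
    seen: Set[str] = set()
    stack = list(deps.get(component, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(deps.get(current, []))
    return seen


impact = affected_by("distance", depends_on)
support = supports("total_cost", depends_on)
```

Here `impact` answers "what could a change to `distance` affect?" and `support` answers "what does `total_cost` rest upon?". Both questions become mechanical once the interdependencies are written down.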
#17b) USE STEPWISE REFINEMENT
Stepwise refinement is a popular and much-used approach to the top-down development of complex systems in many fields. For example, it has been used in software engineering at least since the classic paper by Wirth <1971>. A system built in this way is more apt to accommodate change than one that was not.
The processes of stepwise refinement are close to those of evolutionary change, so if a model and a modeling environment can support the former, they very likely can support the latter.
Morris <1967> views "enrichment" and "elaboration" as a basic modeling skill. These concepts are close to the notion of stepwise refinement.
#17c) AVOID STRUCTURAL AND DIMENSIONAL ARTHRITIS
The lower one goes in the hierarchy of modeling tradition, modeling paradigm, model class, and model instance, the more frequently changes must be made during the course of any given application's life-cycle. Arthritis -- that is, pain associated with change or movement -- generally becomes worse the higher one goes in the hierarchy. For example, it is usually easy to change a single number in a network flow model, a little more difficult to add a new class of nodes to a network flow model, quite disruptive to switch from the network flow paradigm to another paradigm like integer linear programming, and still more disruptive to switch from the operations research modeling tradition to a completely different tradition.
"Dimensional" arthritis has to do with changing model instances within a given model class. "Structural" arthritis has to do with changing model classes within a given modeling paradigm, or even changing paradigms within a given modeling tradition. How one designs and implements a model can influence its susceptibility to arthritis, so this susceptibility is an important design consideration.
The key to avoiding both kinds of arthritis is to use a high-level modeling language that cleanly separates model structure (the essence of "model class") from data (the essence of "model instance"). See maxim #10.
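The separation can be illustrated even outside a modeling language. In the Python sketch below, which is only an assumption-based illustration, the functions `total_cost` and `is_feasible` stand for the general structure of a simple transportation model, while `TransportData` holds one instance. All names and numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Link = Tuple[str, str]  # (source, customer)


@dataclass(frozen=True)
class TransportData:
    """Model instance: everything that varies from one data set to the next."""
    supply: Dict[str, float]
    demand: Dict[str, float]
    unit_cost: Dict[Link, float]


def total_cost(data: TransportData, flow: Dict[Link, float]) -> float:
    """General structure: cost is unit cost times flow, summed over links."""
    return sum(data.unit_cost[link] * q for link, q in flow.items())


def is_feasible(data: TransportData, flow: Dict[Link, float]) -> bool:
    """General structure: respect every supply limit and meet every demand."""
    shipped = {s: 0.0 for s in data.supply}
    received = {c: 0.0 for c in data.demand}
    for (s, c), q in flow.items():
        if q < 0:
            return False
        shipped[s] += q
        received[c] += q
    return (all(shipped[s] <= data.supply[s] for s in data.supply)
            and all(received[c] >= data.demand[c] for c in data.demand))


# Two instances of different dimension; the structure above is untouched.
small = TransportData(
    supply={"A": 10}, demand={"X": 4, "Y": 5},
    unit_cost={("A", "X"): 2.0, ("A", "Y"): 3.0},
)
larger = TransportData(
    supply={"A": 10, "B": 5}, demand={"X": 4, "Y": 5},
    unit_cost={("A", "X"): 2.0, ("A", "Y"): 3.0, ("B", "X"): 1.0, ("B", "Y"): 4.0},
)

small_cost = total_cost(small, {("A", "X"): 4, ("A", "Y"): 5})
larger_cost = total_cost(larger, {("A", "Y"): 5, ("B", "X"): 4})
```

Adding source B changes only the data, so the dimensional change is painless. A structural change, such as adding a new class of nodes, would be confined to the functions and the data schema rather than scattered through instance-specific code.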
#18) MATCH MODEL REALISM WITH SOLVER CAPABILITY
Although I have argued elsewhere (see maxim #11) that models and solvers should be strictly separated, they must be well matched to one another if a modeling application is to be successful. After all, what is the point of designing a model if the end result cannot be solved satisfactorily?
Unfortunately, greater model realism usually means less tractability by available solution technology. This forces the successful modeler to make a conscious trade-off between model realism and mathematical/computational tractability.
Morris <1967> argues for consciously making this trade-off in an iterative way.
Simplicity, the subject of another maxim, is a relevant consideration.
#19) WORK WITHIN ESTABLISHED STANDARDS
The point of this maxim is to advocate the use of standardized modeling paradigms and representational styles.
Other things being equal, there are always advantages to reducing the number of different modeling paradigms or representational styles when modeling: communication is facilitated because fewer paradigms or styles must be learned or remembered by those who must work with them, and model integration is facilitated because more of the things to be integrated are expressed in a common framework.
#20) PREVENT ERRORS BEFORE THEY OCCUR
As the manufacturing strategists say, "Quality should be built in, not added on". In the context of modeling, this means that an attempt should be made at the model design stage to anticipate the kinds of errors likely to occur during the building of a model. Preventative measures should be taken whenever possible.
One specific suggestion is that, when specifying a model class of interest, try to make it as tight as possible. Then any data leading to a model instance that falls outside of the specified class should be automatically recognizable as such (e.g., a parameter specified to be a nonnegative integer rather than simply a real is more resistant to specification error). Moreover, a tight model class often enables automatic data entry by the modeling environment, which of course precludes manual entry errors (e.g., specifying index sets by formula whenever possible -- as discussed at length in Geoffrion <1992> -- makes manual entry unnecessary).
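Both halves of this suggestion can be sketched briefly in Python. The sketch is illustrative only: `check_nonnegative_int` and `periods` are hypothetical helper names, and a real modeling environment would perform such checks itself from the declared model class.

```python
from typing import List


def check_nonnegative_int(name: str, value: object) -> int:
    """Accept only nonnegative integers; reject reals, negatives, and booleans."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"{name} must be an integer, got {value!r}")
    if value < 0:
        raise ValueError(f"{name} must be nonnegative, got {value!r}")
    return value


def periods(horizon: int) -> List[int]:
    """Index set given by formula (1..horizon), so it is never typed in by hand."""
    return list(range(1, check_nonnegative_int("horizon", horizon) + 1))


# A valid specification produces the index set automatically.
planning_periods = periods(4)

# A specification error is recognized at entry rather than at solution time.
try:
    check_nonnegative_int("number_of_trucks", 2.5)
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
```

Had `number_of_trucks` been declared simply as a real, the value 2.5 would have passed unnoticed into the model instance.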
A second suggestion, made by Morris <1967> among others, is to look at very simple numerical instances at the model design stage. This helps to reveal bugs before a full-scale model has been built. Morris also notes other advantages of this practice. Not only should the modeler do the looking, but some of the users as well (in keeping with the maxim to involve the user at all stages).
Many of the other maxims will reduce the likelihood of model-building errors if followed. For example, the beneficial effects of simplicity, mnemonic notation, good documentation, and working within standards are obvious.
See Bisschop <1987> for a paper-length treatment of this topic.
Bisschop, J. <1987>. "Language Requirements for A Priori Error Checking and Model Reduction in Large-Scale Programming," Proceedings of the NATO Advanced Study Institute on Mathematical Models for Decision Support (July 27 - August 6, 1987 in Val d'Isere, France).
Bürger, W.F. <1982>. "MLD: A Language and Data Base for Modeling," Research Report RC 9639, IBM T.J. Watson Research Center, Yorktown Heights, September 14.
Cox, B.J., Jr. <1986>. Object-Oriented Programming, Addison-Wesley, Reading, MA.
Dahl, O.J., E.W. Dijkstra, and C.A.R. Hoare <1972>. Structured Programming, Academic Press, London.
Date, C.J. <1982>. An Introduction to Database Systems, Addison-Wesley, Reading, MA.
Davis, R. <1986>. "Knowledge-Based Systems," Science, 231 (February 28), pp. 957-963.
Fourer, R. <1983>. "Modeling Languages Versus Matrix Generators for Linear Programming," ACM Transactions on Mathematical Software, 9:2 (June), pp. 143-183.
Gass, S.I. <1984>. "Documenting a Computer-Based Model," Interfaces, 14:3 (May-June), pp. 84-93.
Geoffrion, A.M. <1976>. "The Purpose of Mathematical Programming is Insight, Not Numbers," Interfaces, 7:1 (November), pp. 81-92.
Geoffrion, A.M. <1987>. "An Introduction to Structured Modeling," Management Science, 33:5 (May), pp. 547-588.
Geoffrion, A.M. <1989>. "Reusing Structured Models via Model Integration," Proceedings of the Twenty-Second Annual Hawaii International Conference on System Science, Kailua-Kona, Hawaii, January 3-6, IEEE Computer Society Press. Reprinted in Current Research in Decision Support Technology, R.W. Blanning and D.R. King (Eds.), IEEE Computer Society Press, 1992.
Geoffrion, A.M. <1992>. "Indexing in Modeling Languages for Mathematical Programming," Management Science, 38:3 (March).
Hammond, J. <1974>. "Do's and Don'ts of Computer Models for Planning," Harvard Business Review, 52:2 (March-April), pp. 110-123.
Hogan, W.W. and J.P. Weyant <1983>. "Methods and Algorithms for Energy Model Composition: Optimization in a Network of Process Models," in B. Lev (ed.), Energy Models and Studies, North-Holland, Amsterdam.
Hong <1986>. "Guest Editor's Introduction," Computer, 19:7 (July), pp. 12-15.
Miser, H.J. <1989>. "The Craft of Operations Research," Operations Research, 37:4 (July-August), pp. 669-672.
Miser, H.J. and E.S. Quade <1985>. Handbook of Systems Analysis, North-Holland, New York.
Morris, W. <1967>. "On the Art of Modeling," Management Science, 13:12 (August), pp. B707-B717.
Peters, T.J. and R.H. Waterman, Jr. <1982>. In Search of Excellence, Harper & Row, New York.
Sen, T. <1992>. "Diagrammatic Knowledge Representation," IEEE Transactions on Systems, Man and Cybernetics, 22:4, pp. 275-291.
Westerberg, A.W. <1985>. "Aids for Engineering System Model Formulation," Working Paper, Dept. of Chemical Engineering, University of Wisconsin, Madison, presented as the second of three Hougen Lectures, February 27.
Wirth, N. <1971>. "Program Development by Stepwise Refinement," Communications of the ACM, 14:4 (April), pp. 221-227.