Structured Modeling: Survey and Future Research Directions, Section 5

5. IMPLEMENTATION STRATEGIES AND TECHNOLOGIES

As mentioned earlier, a comprehensive foundation for structured modeling has been laid on which future production prototypes can be built. Toward that end, it is now appropriate for research emphasis to be placed on implementation strategies and technologies. We discuss three topics in this vein: host software, semantic formalization and language-directed editors, and language-based customization of specialized application packages.

5.1 Host Software

There are two basic ways to build new modeling environment implementations: program them from the ground up, incorporating ready-made components as possible and appropriate, or build them on top of existing software that already provides much functionality of value to users. Because of the very wide variety of functionalities required by modeling environment users over the typical modeling life-cycle (see, e.g., Geoffrion [1989b], [1989c]), the second approach seems much more practical than the first. Hence we consider now some of the possibilities for host software: integrated software suites, relational database systems, extensible database systems, and object-oriented database systems.

FW/SM's host was Framework IV, a DOS-based integrated personal productivity package outstanding in its day but now dated. At the time of this writing, an implementation in the spirit of FW/SM would likely choose as its host an integrated software suite such as Microsoft Office Professional, which offers spreadsheet, word processing, presentation graphics, SQL database access and management, electronic mail services, and more under Windows. There is good user interface consistency and strong interprocess communication capabilities through dynamic data exchange, object linking and embedding, and other standards set forth in Microsoft's Windows Open Services Architecture. Excellent development tools are available, as are many compatible programs by vendors active in the Windows market.

No comprehensive implementation hosted by an integrated software suite exists today, although the GESCOPP project made progress in this direction.

Another attractive host is a good relational database management system (RDBMS). There are three major structured modeling implementations of this sort (see also the proposal by Dolk [1988a]). One, Lenard's ORACLE-based system for simulation, already has been mentioned.

The second is DAMS (Decision and Algebraic Management System), built on top of Quadbase/SQL and the POET object-oriented DBMS; see Ramirez [1995]. DAMS is accessed through sublanguages collectively called SM/DB; one based on structured modeling provides model definition and manipulation, and SQL serves as the data sublanguage. Access can be interactive or from a programming language such as C++. "Mappings" allow multiple models to share tables containing instance data.

The third is an elaborate system catering to statistical databases called OR/SM (ORACLE/Structured Modeling); see Wright et al. [1997]. The subject of a doctoral dissertation (Worobetz [1992]), OR/SM was implemented in C, with ORACLE (including SQL*Forms, SQL*Menu, and SQL*Plus) as the host system. Models are composed in Microsoft Word using SML extended to include vector-valued generic rules, with parts of level 4 SML omitted and with minor modifications to exploit SQL's potential for efficient evaluation and retrieval. The schema and data are input through user interaction within a completely menu-driven system. Statistical data analysis, also menu-driven, is carried out via a largely transparent PC-SAS interface, and a SAS/OR interface handles optimization models. OR/SM is extensively documented (Kang et al. [1997]), and has been used successfully by many students.

See also the RDBMS-based implementation of Heavey and Brown [1997], which incorporates two SML extensions and was applied to queueing network models in manufacturing.

An intriguing advantage of building a structured modeling environment on top of a RDBMS is the possibility of using its query engine to carry out the evaluation operation, that is, for evaluating function and test element values. This operation, in the extended sense that includes computing explicit index sets whose populations are given by formula, is a crucial operation in computer-based modeling environments having algebraic executable modeling languages (EMLs). It is needed to answer even the simplest user requests for basic computations on models, and also for such purposes as loader/editors that understand model structure, model debuggers, solver interfaces, functional style query languages, and report writers.

Although every structured modeling implementation that chooses a RDBMS host makes extensive use of the query engine, none yet does so in a way that fully exploits the potential efficiency for evaluation inherent in the elaborate query optimization implemented by such engines. Neustadter [1994] points the way to changing that, not only for structured modeling environments, but also for modeling environments that choose algebraic EMLs other than SML. She defines a generic data model that is representative of algebraic EMLs, defines a compatible extended relational data model, and views EML expression evaluation as queries over data tables associated with this extended relational model. From this viewpoint she develops and proves correct an expression translation algorithm from the first data model to the second that eventually should enable evaluation to be carried out efficiently for large models through incremental and optimally organized computations much as in RDBMS query optimization. She checks the framework in detail for applicability to two contemporary EMLs, namely AMPL and SML. See also Lin [1993] for a related approach.

Useful as they are, relational DBMSs nevertheless have their limitations as modeling environment hosts. These limitations have been studied by Desai [1992] for extensible database systems and by Huh [1992] for object-oriented database systems. Each examines in detail the potential of these new types of database systems as hosts for a modeling environment that supports algebraic modeling languages such as SML.

Desai formalizes the essence of what it means for a database system to be "extensible" and develops a suitable data model within this formalization consisting of data structures, operators, and structural constraints. She gives attention to data access strategies for good performance and to version management, and reports on a partial implementation using the experimental system EXODUS. She also gives a comparison of extensible versus object-oriented database systems as platforms for modeling environments.

Huh's concern is with object-oriented modeling environments that simultaneously support multiple algebraic modeling languages. In particular, AMPL, GAMS, and SML are considered in detail. In addition to models in the traditional sense, defined in an object-oriented version of the input-output style of the systems approach, Huh also postulates "functional models" that encapsulate solver functionality. He has completed a prototype implementation using C++ and ObjectStore on a Sun-4. Huh [1993], Huh and Chung [1995] and Huh, Kim and Chung [1998] describe related ObjectStore prototypes incorporating SM.

See also the work of Davis, Srinivasan and Sundaram [1995], [1998] on implementing structured modeling using the object-relational DBMS Illustra, the commercial version of POSTGRES. The work of Ruark [1998], also object-oriented, is promising in a different way.

Thus, it is clear that a variety of attractive host software platforms are available to implementors of new structured modeling environments. Much could be done to adapt and exploit their potential.

5.2 Semantic Formalization and Language-Directed Editors

Vicuña [1990] wrote a dissertation on semantic formalization. He observed that most existing modeling languages for mathematical programming have semantics that are only incompletely formalized. This situation -- which he studied in detail for AMPL, GAMS, and LINGO -- inhibits efforts to achieve a high level of automation for diagnosing errors and generating major components of a computer-based modeling environment (e.g., language-directed editors, type inference tools, and immediate expression evaluators). He demonstrated that attribute grammars furnish a suitable declarative formalism with which to specify the semantics of SML, and a similar development should be possible for most other mathematical modeling languages. As mentioned earlier in our discussion of model integration, Vicuña demonstrated the feasibility of his approach with a fully operational language-directed editor for SML on a UNIX workstation. Moreover, he addressed both the automatic detection of missing language constructs and the immediate evaluation of numerical and logical expressions.

Language-directed editors are possible not only for text-based languages, but for other types as well. For example, see Jones [1992], Chari and Sen [1997], [1998], Hamacher [1995], and Yeo [1997] in the context of graphical languages for SM. Such editors deserve further development, as they are one of the keys to broader acceptance of modeling tools.

5.3 Language-Based Customization of Specialized Application Packages

Almost every real application of optimization requires a model that is unique in some ways. Either a compatible special-purpose optimizer is used, or a general-purpose optimizer is used -- possibly in a manner that exploits special model structure. The skilled practitioner usually can see many other similar applications in the target organization, and in other organizations, that could be solved similarly. But doing so typically would require customizing the software to some degree, and this can be very time-consuming. The point of this topic is to suggest how customization can be accomplished much more quickly (Geoffrion [1992b]).

The starting point is to recognize that the type of customization required to move from one application of a given kind to another of the same kind is quite different from the type of customization required to move to a different kind of application. The difference is that the first type of customization involves only what might be called the model's "surface" structure, whereas the second involves what might be called "core" structure. A model's core structure is whatever aspects of its mathematical structure truly impact a particular optimizer's ability to solve it; everything else is surface structure.

This distinction can be clarified by a familiar example, namely the classical transportation model. In its usual simple form, this model consists almost entirely of core structure. But if embellished with new model features that use additional data to calculate the capacities of the origins, the demands of the destinations, and the magnitudes of the unit shipment costs (perhaps as weighted averages of different freight rates), then all of the new features would be surface structure.

It turns out that surface structure can be customized relatively easily with modeling language-based tools, while this is much less true for core structure modification at the present state of the art. Consequently, I propose a new way to build specialized application packages when execution efficiency is an issue and customized variants are of interest.

In particular, I propose that the interface to the optimizer should be hand crafted for efficiency in such a way that it presumes knowledge of core structure but not knowledge of surface structure. I further propose using SML (or some other model definition language for structured modeling) to describe surface structure, so that SML-based tools can be used to provide many of the functionalities needed for the application at hand. Many such tools are a part of the UCLA prototype FW/SM, as described in Geoffrion [1991].

If this implementation strategy works as intended, then it should be much easier than usual to improve a given application over time as opportunities to improve surface structure become evident (as they always do) and as changes are mandated by external events. It should also be much easier than usual to create new but similar (i.e., same core structure) applications, because the labor-intensive part of the implementation need not be redone.

In both cases, the advantage of this strategy is that it helps to fulfill this dictum of a good implementation: The most volatile aspects of a model should be the easiest to change. [In order of diminishing volatility, the components of a model may be taken as (1) data values, (2) index set contents, (3) surface structure, and (4) core structure. Proper use of database technology can render (1) and (2) easy to change, and the point of the proposed strategy is to make (3) easy to change also.] Moreover, one of the larger difficulties with most of today's modeling languages for mathematical programming will be neatly skirted, namely their inability to automate properly the model/solver interface when special model structure needs to be exploited for efficiency's sake. Finally, one may hope that this strategy will improve the application economics of the many highly specialized algorithms that populate the journals, to the extent that some will become viable in practice for the first time.

Appendix A of Geoffrion [1992b] further clarifies the distinction between core and surface structure, and gives additional details on this implementation strategy. There are many ways in which research and development efforts could be based on this strategy. One such is described by Gazmuri, Maturana, and coworkers in Chile (see [1993] and the more recent references cited therein), whose large project is trying out these ideas together with others in the context of real production and distribution applications in manufacturing. There is also a similar but smaller sponsored project involving some of the same people.

Our discussion of this topic has been limited to models where the key solver is an optimizer, but the strategy suggested here should be just as applicable to other kinds of solvers.

Back to Title Page
Back to Previous Section of Paper
Forward to Next Section of Paper