Unpublished.

An Engineering Ontology Markup Language

Jeffrey C. Lockledge 
Department of Industrial and Manufacturing Engineering 
Wayne State University 
USA

Filippo A. Salustri, P.Eng.
Industrial and Manufacturing Systems Engineering 
The University of Windsor 
Canada

Abstract

Researchers and practitioners in the area of engineering design are looking for innovative, effective ways to use computers to increase the quality and cost effectiveness of engineered products.  This paper presents the idea for a new system to represent design knowledge using XML.  The system, the Engineering Ontology Markup Language (EOML), is intended to allow the description and specification of ontology information regarding designed products.  By employing XML, it is hoped that EOML will (a) substantially improve the integration of the various aspects of modern engineering projects, which tend to be highly collaborative and distributed, and dependent on vast quantities of information and knowledge; and (b) leverage existing and soon-to-exist Web-based technologies in order to minimize implementation costs and risks to practicing engineering groups.  Rather than dwelling on details of EOML, this paper presents the case for XML-based systems for engineering ontologies and discusses our approach to developing EOML.

Introduction

As the state of the art in computation advances, researchers and practitioners in the area of engineering design are looking for innovative, effective ways to use computers to increase the quality and cost effectiveness of engineered products.  Due to the highly collaborative and distributed nature of modern engineering enterprises, it has been natural for both researchers and practitioners to look to the internet and the World Wide Web (WWW).

Until recently, efforts in this area focused on using relatively low-level transfer protocols (e.g. FTP, NFS, etc.) to share data.  Various initiatives such as CALS [1] and NIIIP [2] have sought to synthesize information transfer systems that specialize these more generic forms to meet the requirements of engineering.

Now, however, the focus of attention is moving away from these initiatives and towards the use of newer tools (particularly Java and CORBA).  The latest entry into the list of standards for internet-based information transfer is the Extensible Markup Language (XML) [3], which allows the specification of specialized document structure.  The authors believe that XML can form the foundation of a powerful new approach to specifying engineering design knowledge in a way that is inherently internet-enabled.  However, due to the intentional generality of XML and the specific requirements of engineering design, significant work must be done to specify how XML can be used in this domain.  This paper sets forth an overview of the kind of system the authors are constructing, explaining its motivation and justification, and introducing the major features expected to be present in the completed system. We call the evolving system the Engineering Ontology Markup Language (EOML).

The rest of this paper is organized as follows.  A brief overview of XML and its components is presented, along with a basic rationale for its use to represent design knowledge.  The next section presents the authors' ideas of how XML can be used in this domain, and lays out the foundations of the structures needed to support it within EOML.  A discussion of future work is then presented, with emphasis on defining a development path to get from this project's current state to a useful, "industrial strength" application for use by engineering practitioners.  Finally, some conclusions of the authors' current work and experiences in this area are given.

XML in Engineering Design

This section presents a brief overview of XML and discusses the potential benefits of its application to the specification of design knowledge.

XML is somewhat misnamed: it is not a markup language itself, but rather a language for developing markup languages, that is, a meta-markup language.  HTML is one markup language that can be defined in XML; other such languages tuned to specific application domains can also be developed, and any XML-compliant system should be able to read and parse a document written in any of them, no matter what the specific markup language is.

XML is not a single standard; rather it consists of three components (so far). XML itself defines the syntax and grammar of a document's structure.  Tags are used to set off markup entities from the actual content of the document.  A Document Type Definition (DTD) defines what tags can appear in a document, how those tags can be nested, and what attributes each tag can have.

XML makes no commitments about the presentation or appearance of documents; XSL, the Extensible Style Language [4], handles this.  By separating content specification in XML from presentation specification in XSL, XML-compliant applications can be built that use document content without necessarily presenting it (e.g. a case-based reasoning engine), and so avoid the overhead associated with handling presentation matters.

Finally, linking documents together will eventually be handled by XLL, the Extensible Linking Language [5].  In the current specification of HTML, linking is done only through the <a> tag.  XLL will allow a far richer functionality, including multi-directional links, linking to other kinds of entities than the usual URLs (Uniform Resource Locators), different kinds of links (e.g. table of contents, section headings, indices, etc.), and alterable behavior (the action that occurs when a link is activated).

Clearly, XML is strategically placed to provide a mechanism for transferring structured design information over the internet.  If this were possible, then various Web-based search/query systems could be developed to provide access to that information across a broad spectrum of user communities, thus helping to integrate otherwise disparate segments of industry.  This kind of integration is one of the fundamental advantages that an XML-based approach to design knowledge specification would achieve.

The other major benefit is that this approach leverages all of the existing (and soon to exist) technology of Web-based information transfer.  The infrastructure (the internet) is already in place; many fundamental tools, such as browsers, query systems, etc. either exist or are currently under development.  There is little doubt in the internet community that XML will become the critical tool for Web-based communications.  A design knowledge specification standard such as EOML will take advantage of all this expertise and technology, thus lowering the cost and risk associated with moving to this new technology.

However, the most fundamental question is: what is the nature of the structure that EOML would have to exhibit to be useful in engineering applications?  Without a sufficient answer to this question, little else can happen.

Let us consider a very simple example of representing information about automobiles.  It might be represented in EOML as follows:
 

<full-size-car> 
  <name>Taurus</name> 
  <Color>
     <RGBColor> <Red>255</Red>
               <Green>0</Green>
               <Blue>0</Blue></RGBColor></Color>
  <InteriorColor>
     <RGBColor> <Red>0</Red>
               <Green>0</Green>
               <Blue>0</Blue></RGBColor></InteriorColor>
  <engine><number-of-cylinders>6</number-of-cylinders></engine> 
  <transmission><speeds>4</speeds></transmission> 
</full-size-car> 
Figure 1.  Example XML Describing Engineering Data

The structure in the above example is provided by XML tags such as <full-size-car>.  But these tags are specific to a particular class of engineered product and are not found in conventional markup vocabularies (such as HTML).  Given some description such as that above, how does an application know that a Color is a meaningful element of a full-size-car?  In order to support this kind of structure, there must be some way of specifying both what special tags can exist in a document to describe the structure and what those tags mean.  In other words, a mechanism must exist for capturing ontologies of design knowledge.
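
To make the question concrete, the following minimal sketch (in Python, which is not part of EOML; it assumes only the standard xml.etree.ElementTree module and a well-formed copy of the car description with consistently matched tags) reads the description with a generic XML parser.  The helper name child_tags is hypothetical.  The parser recovers the nesting of the elements, but nothing in its output says whether Color is a legitimate element of a full-size car.

```python
import xml.etree.ElementTree as ET

# A well-formed copy of the car description (tag case matched throughout).
CAR_XML = """
<full-size-car>
  <name>Taurus</name>
  <Color>
    <RGBColor><Red>255</Red><Green>0</Green><Blue>0</Blue></RGBColor>
  </Color>
  <engine><number-of-cylinders>6</number-of-cylinders></engine>
  <transmission><speeds>4</speeds></transmission>
</full-size-car>
"""


def child_tags(element):
    """Return the tag names of the direct children of an element."""
    return [child.tag for child in element]


car = ET.fromstring(CAR_XML)

# The parser knows the structure, not the meaning, of the tags.
print(car.tag, child_tags(car))
```

Deciding that a Color belongs to a full-size-car therefore needs information from outside the document itself, which is the role an ontology is meant to play.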

An ontology is a formal structure that provides a deep categorization of knowledge so that it can be reasoned with at various levels of abstraction; in essence, it provides the means to associate a semantics with a set of terms that denote categories of entities of interest.  EOML will allow the construction of ontologies for knowledge about engineered products.  But in order to define the system, we must have an understanding of the kinds of XML entities the system will manipulate in order to represent the ontologies.

Developing the logical framework for an ontology building system such as that proposed herein goes beyond the scope of this paper and, in any event, it is a matter currently under development by the authors.  The emphasis here is on the question of whether XML is able to represent these ontologies; the exact form of the ontological constructs does not matter here.  However, insofar as the XML-based system described in the next section takes advantage of some of the authors' work in ontologies for engineering [6], some discussion of that work is relevant here for expository purposes.

In the authors' work, an entity can be either an object or a class.  Object and class are taken to mean roughly what they mean in a typical object-oriented framework, except that we consider them to be disjoint types.  Entities are named, and those names are used to identify entities.  However, the same name may be used to refer to different entities under different circumstances.  The authors therefore use the notion of context to group terms (name/entity pairs) into collections.  A network of terms forms a unit of knowledge.  Within a context, a given name represents only one entity, but the same name may be used to identify different entities in different contexts.  Contexts may be (and usually are) nested; the "outermost" contexts contain terms that are commonly understood, whereas the "innermost" contexts contain terms specific to particular applications, agents, problems, tasks, etc.  Substantial effort is currently being put into formalizing the notion of context [7].  The importance of context has also been recognized by the WWW Consortium, which is currently investigating the notion of namespaces (essentially the same as contexts) [8].
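
The behavior of nested contexts can be sketched as follows.  This is an illustration only, written in Python rather than in EOML: the class name Context, its methods, and the sample terms are all hypothetical, and the lookup rule (search the innermost context first, then each enclosing context) is an assumption that the formal work on contexts may refine.

```python
class Context:
    """A named collection of terms (name/entity pairs), optionally nested."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # enclosing ("outer") context, if any
        self.terms = {}       # within one context, a name maps to one entity

    def define(self, name, entity):
        """Add a term; a name may denote only one entity per context."""
        if name in self.terms:
            raise ValueError(f"{name!r} already names an entity in {self.name!r}")
        self.terms[name] = entity

    def resolve(self, name):
        """Look a name up here first, then in enclosing contexts (assumed rule)."""
        context = self
        while context is not None:
            if name in context.terms:
                return context.terms[name]
            context = context.parent
        raise KeyError(name)


# Illustrative data: an outer context of commonly understood terms
# and two inner, application-specific contexts.
common = Context("common")
common.define("car", "a road vehicle")
common.define("hood", "a hinged cover over the engine compartment")

body_shop = Context("body-repair", parent=common)
body_shop.define("hood", "a significant sheet-metal component")

engine_shop = Context("engine-repair", parent=common)
engine_shop.define("hood", "an obstacle to reaching the engine")

print(body_shop.resolve("hood"))
print(engine_shop.resolve("hood"))
print(engine_shop.resolve("car"))
```

The same name, hood, resolves to different entities in the two inner contexts, while car is found in the shared outer context.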

Entities can have attributes of three kinds: properties, components, and abstractions.  A property is an inherent characteristic whose value cannot be derived from any one component or abstraction of the entity; for example, mass, shape, and size are properties.  A component is an entity that may appear as an attribute value of another entity; this kind of relation is called a part/whole relation, and the study of these kinds of relations, called mereology, is an emerging field in the area of knowledge representation [9].  It seems obvious that mereology will play a very important role in the development of ontologies for engineering.  There are many different kinds of part/whole relations, and there is currently no clear mechanism to categorize and formalize them; however, the matter is being vigorously pursued by researchers.  Finally, abstractions are those attributes used to relate objects to classes, and subclasses to superclasses.
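
As a rough illustration of the three kinds of attributes, consider the following Python sketch.  It is not the authors' formal model: the class Entity, the helper all_parts, and the sample values are hypothetical, and for brevity a single type stands in for both objects and classes, although the text above treats them as disjoint.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    # Inherent characteristics such as mass, shape, and size.
    properties: dict = field(default_factory=dict)
    # Entities that are parts of this entity (part/whole relations).
    components: list = field(default_factory=list)
    # Links from objects to classes, or from subclasses to superclasses.
    abstractions: list = field(default_factory=list)


def all_parts(entity):
    """Collect the names of all direct and indirect components."""
    names = []
    for part in entity.components:
        names.append(part.name)
        names.extend(all_parts(part))
    return names


# Illustrative values only.
car_class = Entity("Car")
piston = Entity("piston")
engine = Entity("engine", properties={"mass_kg": 150.0}, components=[piston])
my_car = Entity(
    "my-car",
    properties={"mass_kg": 1500.0},
    components=[engine],
    abstractions=[car_class],
)

print(all_parts(my_car))
```

Note that the mass of my-car is recorded as a property of the whole: it cannot be read off any single component.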

Given the ontological entities described very briefly above, a mapping to the kinds of constructs available in XML is now needed.  The major constituents of an XML document are elements, used to demarcate regions of a document that are to be treated, somehow, as a unit.  Elements are demarcated by tags that identify the kind of element and any special attributes that an element may have.  Since XML deals only with syntactic constructs, it distinguishes between content that is parseable and content that is not parseable.  Content that is not parseable is delivered unmodified through the parser to an underlying application (such as a browser).  This allows content that is strictly present for semantic purposes to be passed through the syntactic XML component.

XML-based knowledge representations must clearly distinguish between the syntactic and the semantic components of an ontology.  XML can be used to establish if an ontology is syntactically well-formed, whereas the underlying application must be responsible for the semantic analysis of the ontology.
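
The division of labor can be illustrated with a short Python sketch that uses only the standard xml.etree.ElementTree parser.  The function name is_well_formed is hypothetical, and the sample fragments are illustrative.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as XML; a purely syntactic check."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# XML names are case-sensitive, so <Red> closed by </red> is a syntax error.
print(is_well_formed("<RGBColor><Red>255</red></RGBColor>"))  # False

# Matching tags make the fragment well-formed.
print(is_well_formed("<RGBColor><Red>255</Red></RGBColor>"))  # True

# Also well-formed, although 999 may not be a meaningful color value;
# rejecting it would be the job of the underlying application.
print(is_well_formed("<RGBColor><Red>999</Red></RGBColor>"))  # True
```

The parser accepts or rejects a document on syntactic grounds alone; whether the content makes sense for an engineered product is left to the semantic layer.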

Developing Knowledge Representation in XML

This section contains a DTD for a contextually oriented ontology and an example of its use (see Figure 2).  A document describing an ontology must always start with the context, or contexts, in which the ontology applies.  This is necessary because different individuals and organizations have their own ontological constructs that must be respected if meaningful knowledge transfer is to occur.  For example, if two organizations work in automotive repair, one in body repair and the other in engine repair, they will each see the car differently.  The body repair shop thinks of the engine compartment as an area of little interest, but considers the hood a significant component.  The engine repair shop, on the other hand, sees the hood as an obstacle to be overcome in reaching the engine.  Each can therefore be expected to have a great deal of detail about the objects in which it has an interest, and little or none about other areas.  If the two organizations have to communicate to resolve a mutual concern, they must be sure that the terms, their semantics, and the knowledge structure are all in correspondence before communication can take place.

The authors expect that translators between contexts can be developed; however, this is outside the scope of this paper.

The second section of the DTD describes the structure of the ontology.  The ontology has three main components: class definitions, an abstraction mechanism, and a collection mechanism.  This allows authors to create an ontology with their own classes of objects and the knowledge structure that is appropriate to their application.
 

<!ELEMENT context (ontology | context)+ >
<!-- A single context may have multiple ontologies or sub-contexts but must have at least one. -->
<!ATTLIST context
   name ID #REQUIRED>
<!-- Contexts require a name. -->

<!ELEMENT ontology (classdef | abstraction-of | contains)+ >
<!-- An ontology is a set of classes, abstraction-of relationships, and part-of relationships with a cardinality of at least one. -->
<!ATTLIST ontology
     name ID #REQUIRED>
<!-- An ontology must have a name. -->

<!ELEMENT abstraction-of (classref , classref+)>
<!-- One or more classes belong to another class. This allows us to say: person abstraction-of fred, sally, tom. This is just for notational simplicity; it could also be written fred abstraction-of person, sally abstraction-of person, tom abstraction-of person, etc.  -->

<!ATTLIST abstraction-of
     kind (Specialization | Sub-Class | Similarity |
            Instance-of) #IMPLIED
     variance (covariant | counter-variant) #IMPLIED>
<!-- Abstraction-of relationships may have a kind or a variance. Neither is required, and both have a limited domain. If either is not supplied, its value is taken to be "unknown". -->

<!ELEMENT contains (classref , classref+)>
<!-- One or more classes are contained by (are parts of) a class. -->
<!ATTLIST contains
     type CDATA #IMPLIED
     requirement (necessary | possible) #IMPLIED>
<!-- A contains relationship may have a type - this should be an enumerated type but mereology has yet to define these clearly. A contains relationship may have an attribute defining the necessity of having each contained part.  "Necessary" means the part is required for the whole.  "Possible" means the part can be there or not, but its existence doesn't affect the validity of the whole.  -->

<!ELEMENT classdef  (attribute)*>
<!ATTLIST classdef
   name ID #REQUIRED>
<!-- A class definition can have attributes (though it isn't required to have any).  A class must have a name. -->

<!ELEMENT attribute (name , type)>
<!-- An attribute must have a name and a type. -->

<!ELEMENT name (#PCDATA) >
<!-- The name is a word describing the attribute.-->

<!ELEMENT type (classref | primitive) >
<!-- The type is a description of the kind of values the attribute can hold. These are either other objects or one of the primitive types (e.g. integer, real, string, etc.) -->

<!ELEMENT classref EMPTY >
<!ATTLIST classref
      name IDREF #REQUIRED>
<!-- This just references a class that has been defined elsewhere -->

<!ELEMENT primitive (#PCDATA) >
<!-- A primitive type.  -->
 

Figure 2.  The DTD for a Contextual Ontology Description

A class definition is made up of zero or more attributes, which allows for the possibility of an object that has no attributes.  This would be required in the situation mentioned earlier, in which individuals (e.g. those in the engine repair shop) recognize that an object (e.g. the hood) exists but find it of little intrinsic interest because it is outside the scope of their work.

The term used for abstraction, abstraction-of, is different from the term typically used in object-oriented programming, is-a.  EOML's abstraction mechanism is more specialized than it is in traditional object-oriented languages (OOLs).  This is one of the things that separates an ontology from an OOL.  From a purely programmatic standpoint, it is irrelevant why a set of concepts can be abstracted by another.  From the standpoint of knowledge representation, and the reasoning that may be done from it, the differentiation is significant.  For example, a specialization of one object from another means that there is a well-known taxonomic structure at work which is recognized among many contexts.  A similarity, on the other hand, may be completely coincidental, and/or purely a function of the current context.

The collection mechanism, which is referenced by the contains clause, allows the author to indicate when a class has, as part of its identity, a set of other classes.  This would permit, for example, a user to have a car that has four wheels and potentially a radio.  In product engineering it is particularly important to have a mechanism to include such optional features.
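
A minimal sketch of how an application might read such relationships is given below, in Python with the standard xml.etree.ElementTree module.  Several things here are assumptions rather than part of EOML: the function name parts_by_requirement, the illustrative class names Wheel and Radio, the reading that the first classref names the containing class, and the use of "unknown" when no requirement is given.  The fragment omits the class definitions that a complete document would need.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: Wheel and Radio are illustrative class names.
FRAGMENT = """
<ontology name="Car-Options">
  <contains requirement="necessary">
    <classref name="Car" /><classref name="Wheel" />
  </contains>
  <contains requirement="possible">
    <classref name="Car" /><classref name="Radio" />
  </contains>
</ontology>
"""


def parts_by_requirement(ontology):
    """Map each whole to its parts, grouped by the requirement attribute."""
    result = {}
    for relation in ontology.findall("contains"):
        refs = [ref.get("name") for ref in relation.findall("classref")]
        # Assumed reading: the first classref is the containing class.
        whole, parts = refs[0], refs[1:]
        requirement = relation.get("requirement", "unknown")
        result.setdefault(whole, {}).setdefault(requirement, []).extend(parts)
    return result


parts = parts_by_requirement(ET.fromstring(FRAGMENT))
print(parts)
```

Because the requirement attribute belongs to the contains element as a whole, necessary and optional parts are stated in separate contains elements.  The DTD as given has no place to record a count such as four wheels; that would need a further attribute or element.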

The following (Figure 3) is an example ontology definition.  It is created using the DTD given above and describes the relationship between a car, specializations of that car, and a subset of its components.  The reader should realize that a real ontology would be much larger and this is only given as an example.  The ontology defined in Figure 3 could be used to generate the example given in Figure 1.
 

<context name="The-Large-Car-Company">
 <ontology name="Car-Structure">
  <classdef name="Economy" >
   <attribute><name>Color</name>
    <type> <classref name="RGBColor" /> </type></attribute>
  </classdef>
  <classdef name="Full-Size-Car">
   <attribute><name>Color</name>
    <type><classref name="RGBColor" /></type></attribute>
   <attribute><name>InteriorColor</name>
    <type><classref name="RGBColor"/></type></attribute>
  </classdef>
  <classdef name="Car">
   <attribute><name>Name</name>
    <type><primitive>string</primitive></type></attribute>
  </classdef>
  <abstraction-of kind="Specialization">
   <classref name="Car" />
   <classref name="Full-Size-Car" />
   <classref name="Economy" /></abstraction-of>
  <contains><classref name="Car" />
   <classref name="Engine" />
   <classref name="Transmission" /></contains>
  <classdef name="Engine">
   <attribute><name>Number-of-Cylinders</name>
    <type><primitive>integer</primitive></type></attribute>
  </classdef>
  <classdef name="Transmission">
   <attribute><name>Speeds</name>
    <type><primitive>integer</primitive></type></attribute>
  </classdef> 
  <classdef name="RGBColor">
   <attribute><name>Red</name>
    <type><primitive>integer</primitive></type></attribute>
   <attribute><name>Green</name>
    <type><primitive>integer</primitive></type></attribute>
   <attribute><name>Blue</name>
    <type><primitive>integer</primitive></type></attribute>
  </classdef>
 </ontology>
</context>
 
Figure 3:  An Example Ontology
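
Because class names are declared as ID attributes and class references as IDREF attributes, every classref must match the name of some classdef exactly, including letter case.  A validating parser would enforce this; the following Python sketch performs the same check by hand with the standard xml.etree.ElementTree module.  The function name undefined_references and the small sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def undefined_references(ontology):
    """Return classref names that match no classdef name in the ontology."""
    defined = {c.get("name") for c in ontology.iter("classdef")}
    referenced = {r.get("name") for r in ontology.iter("classref")}
    return sorted(referenced - defined)


# Illustrative sample: "car" does not match the defined class "Car".
SAMPLE = """
<ontology name="Car-Structure">
  <classdef name="Car" />
  <classdef name="Engine" />
  <contains><classref name="car" /><classref name="Engine" /></contains>
</ontology>
"""

print(undefined_references(ET.fromstring(SAMPLE)))  # ['car']
```

A check of this kind is one of the simplest tasks that could be assigned to the application layer when a non-validating parser is used.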

Related Work

Due to the rate at which Web technology is growing, it is difficult to remain current on all the latest developments.  Nonetheless, some other related efforts have come to the attention of the authors, and are noted here.

The XML-Data effort [10] provides the means to define the characteristics of classes of objects, including those that define concepts and relations between concepts.  As such, there is a potential for this initiative to be useful in developing ontologies.  However, it is unclear to the current authors at this time whether XML-Data will support a rich enough environment to capture the specific kinds of knowledge necessary in engineering applications.  In any event, should XML-Data be sufficient, a significant amount of work is still needed to provide mechanisms for developing, validating, and using its schemas (mechanically similar to ontologies of the current authors).

The Chemical Markup Language (CML) is one of many specific applications (others include Bioinformatic Sequence Markup Language, Wireless Markup Language, and Mathematical Markup Language) that must define specialized XML DTDs to support their application domains.  But these applications work within fairly restricted domains well grounded in physical reality.  On the other hand, there is a virtually unlimited number of ontologies possible to describe engineered products.  Thus these efforts, though tremendously important, are insufficient for the authors' purpose.

Future Work

A great deal of work remains to be done to develop EOML to the point of making it a serviceable component of Web-based computing for engineering applications.  In terms of the standard itself, work goes on along three paths.  First, the logical models underlying our ontology mechanisms continue to be developed to incorporate a notion of function modeling and a better notion of quantity (i.e. dimensional information).  Second, we must further refine the DTD for our ontological structures; to this end, various case studies, primarily set in the area of automotive engineering, will be investigated.  Finally, applications that will employ EOML as a communications protocol, including an underlying knowledge-based system for semantic interpretation of EOML documents, are being constructed.

Ultimately, the authors envision a collection of tools that will allow:

  1. the graphic display of XML-based descriptions of design knowledge based on some schematic or diagrammatic method;
  2. the editing and management of these descriptions through a combined graphics/text-based editor; and
  3. the querying of remote, distributed databases of design knowledge.

Furthermore, we hope the system will find use as a communication mechanism for intelligent design agents and knowledge brokers, software entities that will be responsible for maintaining, distributing, and answering queries about distributed knowledge bases of engineered products.

Conclusions

This paper has sketched a proposal for a system for the description of ontologies for engineered products based on XML.  It seems obvious that XML will become a key standard for Web-based information exchange.  Since engineering projects are becoming more and more distributed, highly collaborative efforts involving huge amounts of information and knowledge, such a tool could be of tremendous use in the development of next-generation computer-aided engineering tools.  Our experience with XML to date indicates that EOML is quite feasible.

References

  1. Continuous Acquisition and Life-Cycle Support (CALS) Standards, http://navycals.dt.navy.mil/calsstds.html.
  2. National Industrial Information Infrastructure Protocols (NIIIP), http://niiip.org.
  3. T. Bray, J. Paoli, and C. M. Sperberg-McQueen, eds., Extensible Markup Language (XML) 1.0 - W3C Recommendation 10 Feb 1998 http://www.w3.org/TR/1998/REC-xml-19980210.
  4. S. Adler, A. Berglund, J. Clark, I. Cseri, P. Grosso, J. Marsh, G. Nicol, J. Paoli, D. Schach, H. S. Thompson, and C. Wilson, A Proposal for XSL, submitted to W3C 27 August 1997, http://www.w3.org/TR/NOTE-XSL.
  5. T. Bray, and S. DeRose, Extensible Markup Language (XML): Part 2. Linking, W3C Working Draft 31 July 97, http://www.w3.org/TR/WD-xml-link-970731.
  6. F. A. Salustri, A Formal Theory for Knowledge-Based Product Model Representation, 2nd IFIP WG 5.2 Workshop on Knowledge Intensive CAD, Carnegie-Mellon University, Sep 16-18, 1996; pages 59-78, Chapman & Hall.
  7. V. Akman and M. Surav, Steps toward Formalizing Context, AI Magazine 17(3):55-72, 1996.
  8. T. Bray, D. Hollander, and A. Layman, Name Spaces in XML, W3C Note 19 Jan 1998, http://www.w3.org/TR/1998/NOTE-xml-names.
  9. A. Artale, E. Franconi, and N. Guarino, Open Problems with Part-Whole Relations, Proceedings of 1996 International Workshop on Description Logics, Boston, MA, pages 70-73, Nov 1996.
  10. A. Layman, E. Jung, E. Maler, H. S. Thompson, J. Paoli, J. Tigue, N. H. Mikula, and S. De Rose, XML-Data, W3C Note 05 Jan 1998, http://www.w3.org/TR/1998/NOTE-XML-data-0105.