Subject Term Index
The Subject Term Index is a thesaurus of Subject Terms authorized for use in indexing the records contained in the Encyclopedia of Law. Every legal resource contains at least one Subject Term. You can search or browse the Subject Term Index to identify Subject Terms and find related legal resources.
The structure centers on concepts and descriptor classes rather than only on descriptors and terms. We see a descriptor as a class of concepts and a concept as a class of terms. Each of these classes can then be represented by a unique numerical identifier that is independent of any term used to name the class and that persists when the name changes. This structure makes it possible to store additional useful data and to store existing data more efficiently.
Concept. A meaning named by a term.
Descriptor Class. A set of concepts closely related to each other in meaning. For the purposes of indexing and retrieval, these concepts are best lumped together. The traditional entry terms found in thesauri are often not synonymous; considering the concepts named by the terms as members of a larger class allows a more formal representation of the appropriate relationships.
Semantic Type. The basic category or categories of meaning of a concept.
Add data object: concepts, as members of descriptor classes.
- Attach concept attributes to the appropriate concept.
- Terms grouped in concept classes of equivalent terms according to concept.
- Concept attributes attached only to the concept.
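The structure outlined above can be sketched as a small data model. This is only an illustration, not the actual record layout: the class names, the `attributes` dictionary, and the counter used to hand out numeric identifiers are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from itertools import count

_ui_counter = count(1)  # hypothetical source of numeric identifiers


@dataclass
class Concept:
    """A meaning, represented as a class of equivalent terms."""
    preferred_term: str
    terms: set = field(default_factory=set)
    attributes: dict = field(default_factory=dict)  # attached to the concept only
    ui: int = field(default_factory=lambda: next(_ui_counter))

    def __post_init__(self):
        # The preferred term is always a member of the concept's term class.
        self.terms.add(self.preferred_term)

    def rename(self, new_term: str) -> None:
        """Change the preferred term; the identifier is untouched."""
        self.terms.add(new_term)
        self.preferred_term = new_term


@dataclass
class DescriptorClass:
    """A set of closely related concepts lumped together for indexing."""
    concepts: list
    ui: int = field(default_factory=lambda: next(_ui_counter))


cardiac = Concept("Heart Arrest", terms={"Cardiac Arrest"})
descriptor = DescriptorClass(concepts=[cardiac])

ui_before = cardiac.ui
cardiac.rename("Cardiac Arrest")  # the name changes, the identifier persists
```

Because the identifier is assigned once and never derived from a term, anything that refers to the concept by `ui` is unaffected when the preferred term changes.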
Relationships in Medical Subject Headings (MeSH)
Stuart J. Nelson, W. Douglas Johnston, and Betsy L. Humphreys
National Library of Medicine, Bethesda, MD, USA
Recent efforts to make some of the relationships within MeSH more explicit have led to a deeper understanding of the nature of these relationships. This chapter will explore the relationships represented in MeSH in the light of that understanding. Every term that occurs may be thought of as representing a concept. One or more terms, comprising one or more concepts, grouped together for important reasons, form a descriptor class. The descriptor class is the basic building block of the thesaurus. Relationships among concepts can be represented explicitly in the thesaurus, most notably as relationships within the descriptor class. Hierarchical relationships are at the level of the descriptor class. The hierarchies are key in allowing expanded retrievals. The hierarchical relationships, traditionally thought of as broader or narrower (parent-child) relationships, are better understood as representing broader and narrower retrieval sets. Nevertheless, these hierarchical relationships often reflect important broader-narrower relationships between preferred concepts in descriptor classes. Other types of relationships present in the thesaurus include associative relationships, such as the Pharmacologic Actions or see-related cross references, as well as forbidden combination expressions, such as the Entry Combination.
The Medical Subject Headings (MeSH) have been produced by the National Library of Medicine (NLM) since 1960. The MeSH thesaurus is NLM’s controlled vocabulary for subject indexing and searching of journal articles in MEDLINE, and books, journal titles, and non-print materials in NLM’s catalog (Bachrach & Charen, 1978). Translated into many different languages, MeSH is widely used in indexing and cataloging by libraries and other institutions around the world.
Forty years of heavy use have led to a significant expansion in the content of MeSH and to considerable evolution in its structure. It is one of the most highly sophisticated thesauri in existence today. The relationships currently represented in MeSH provide good examples of the types of relationships useful in any thesaurus.
Recently, the maintenance environment for the Medical Subject Headings was redesigned. The altered data structure allows more explicit representation of the relationships, and makes clearer the roles of the relationships and the objects (term, concept, and descriptor class) involved. The new system design has made possible a deeper understanding of these thesaural relationships. It supports the development and maintenance of a rapidly expanding vocabulary. It is hoped that the new data structure will support more advanced technology in information retrieval, such as identifying the appropriate MeSH term within the Metathesaurus of the Unified Medical Language System (UMLS) for such purposes as semi-automated indexing (Schuyler et al., 1993; Nelson et al., 1999). There were other benefits obtained from the redesign process as well, including the development of a mission statement for MeSH. This statement provides a framework from which it is possible to explore the goals and requirements of MeSH. A deeper appreciation of the importance of intention or purpose to the representation of relationships has also emerged.
Understanding MeSH requires an understanding of its structure. There are three major components to the Medical Subject Headings: the Headings themselves, the Subheadings (also known as Qualifiers), and the Supplementary Concept Records. This chapter includes a discussion of the historical MeSH structures, a review of the mission statement of MeSH, and descriptions of the relationships in the MeSH thesaurus discussed in the following order: equivalence relationships, including entry terms and synonyms; hierarchical relationships; and associative relationships.
2. THE HISTORICAL MESH STRUCTURE:
EXPLICIT AND IMPLICIT REPRESENTATIONS
One way to understand the types of relationships within MeSH is to review some of the data elements of each MeSH component. Many of the data elements contain information about relationships, both inter- and intra-descriptor. As is evident in other chapters of this book, it is nonsensical to talk of relationships without talking of meaning; any discussion of the record elements must include how the meaning is represented. The records for the different MeSH components will be discussed in turn, though some of the elements are common to more than one type of record.
2.1 Main Headings
Main headings are the meat of the MeSH thesaurus. They are used to describe what an article or book is “about.” That is, as index terms they provide an indication of the major topics under consideration.
Among the data elements in the record for a main heading are:
- MESH HEADING (MH)
- This is the term used in the MEDLINE database as the indexing term. The term reflects a meaning; its use indicates the topics discussed by the work cited.
- ENTRY TERM, PRINT
- These terms, which are printed in the MeSH publications, are used as pointers to the MH. The presence of an entry term in the record is an indication that this topic should be indexed by the given MH.
- ENTRY TERM, NON-PRINT
- These terms are the entry terms that are not printed. For a variety of reasons, trade names, lab numbers, and some permutations of other terms, either entry terms or the MH, may not be selected for printing.
- RUNNING HEAD, MESH TREE STRUCTURES
- While primarily an indicator of what should appear on the printed page, the running head often provides a general category of where the MH will appear in the hierarchy.
- MESH TREE NUMBER
- The tree numbers indicate the places within the MeSH hierarchies, also known as the Tree Structures, in which the MH appears. Thus, the numbers are the formal computable representation of the hierarchical relationships.
- CAS TYPE 1 NAME
- For chemical names, the structural name assigned by the Chemical Abstracts Service (CAS) in accordance with the American Chemical Society naming convention. This structural name is a synonym for the MH.
- CAS REGISTRY/EC NUMBER
- For chemicals, the CAS registry number assigned to this compound. For enzymes, the Enzyme Commission number provides a classification of the activity of the enzyme. In both of these cases, the number is a formal computable synonym of the Heading.
- RELATED CAS REGISTRY NUMBER
- For chemicals, these are the registry numbers of congeners and derivatives indexed by the same MH.
- SEMANTIC TYPE
- Every term is assigned one or more semantic types (general categories) from the UMLS Semantic Network (McCray & Hole, 1990). These categories help assign meaning to a term.
- CONSIDER ALSO (XREF)
- The notation in this field alerts the user of possible terms with different morphemes (e.g., for kidney, ‘consider also terms at renal’). A user is thus alerted to possible semantic relationships not represented in the usual orthographic representation as seen in a book.
- ENTRY COMBINATION
- This field is present as an indication that a combination of a qualifier and the MH should be indexed by a different MH. It forbids, and in some systems prevents, the establishment of a relationship between a qualifier and a MH, forbidden because the meaning can be expressed in a different manner covered by a different MH. For example, the MH/SH combination ‘Pregnancy/complications’ is forbidden, and the user is pointed to the MH ‘Pregnancy Complications.’
- FORWARD CROSS REFERENCE (SEE ALSO REFERENCE)
- A MH that is closely associated with or should be carefully differentiated from the MH.
- ANNOTATION
- The written instructions given to indexers at NLM for correct use of the MH. It often elucidates meaning as well as usage, and may indicate special relationships with other headings.
- ALLOWABLE TOPICAL QUALIFIERS
- The topical subheadings or qualifiers that may be assigned to the MH by indexers. Some qualifiers which might otherwise be allowed are forbidden and given (with the MH) as Entry Combinations.
- MESH SCOPE NOTE
- This short piece of free text provides a type of definition, in which the meaning of the MH is circumscribed. Other MHs frequently appear in scope notes, usually in ALL CAPS. These represent relationships, which are often very important, but which may not otherwise be represented in the MeSH structure.
- PHARMACOLOGICAL ACTION
- The actions and uses of various chemicals are indicated by a different MH in this field. These relationships are descriptive, often indicating why the particular chemical may be of interest.
- PREVIOUS INDEXING
- The expression here indicates how articles were indexed in MEDLINE prior to the establishment of the MH. This is a historical representation of how the meaning was expressed prior to the advent of the MH.
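The Entry Combination element described in the list above lends itself to a simple lookup. The sketch below assumes a hypothetical table keyed by (main heading, qualifier) pairs; only the ‘Pregnancy/complications’ pair comes from the text, and the function name and return shape are illustrative.

```python
# Hypothetical table: (main heading, qualifier) -> main heading to use instead.
ENTRY_COMBINATIONS = {
    ("Pregnancy", "complications"): "Pregnancy Complications",
}


def resolve(main_heading: str, qualifier: str) -> tuple:
    """Return the (heading, qualifier) pair an indexer should actually assign."""
    replacement = ENTRY_COMBINATIONS.get((main_heading, qualifier))
    if replacement is not None:
        # Forbidden pair: the meaning is covered by a different MH on its own.
        return (replacement, None)
    return (main_heading, qualifier)
```

A system that merely forbids the combination could raise an error at the same point instead of substituting the other heading.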
2.2 Subheadings
Historically, subheadings or qualifiers have been of several different types. The ones of most interest are the topical qualifiers. Included in the data elements are those listed below:
- SUBHEADING (SH)
- The term that is used in the MEDLINE database to qualify the MH indexing terms. Its use indicates how the meaning of the MH should be refined (i.e., which particular aspect of the topic is addressed in the work cited).
- QUALIFIER TYPE
- The types into which a qualifier can be categorized include topical, form, geographic, and language. These general types determine allowable usage of the qualifier. The topical qualifiers can be thought of as modifiers of the meanings of MHs, while form, geographic, and language qualifiers are essentially descriptive of the citation.
- QUALIFIER CROSS REFERENCE
- An entry term to the subheading.
- MESH SCOPE NOTE
- A type of definition that characterizes the modification of meanings allowed by the usage of the subheading.
- TREE NODE ALLOWED
- The node annotations indicate the MeSH Tree subcategory(ies) in which the subheading is usually allowed. The areas or topics that generally relate to the subheading are listed here.
2.3 Supplementary Concepts
Supplementary Concept Records are edited and added to MeSH daily. Preferred names in these records can be assigned to a special data element (Name of Substance) within the MEDLINE record of a citation. As many of the names of the data elements imply, the bulk of these records are related to chemicals and drugs. Data elements include:
- NAME OF SUBSTANCE (NM)
- This name, the preferred name of the substance, is the term used in the MEDLINE database to represent a Supplementary Concept. It can be thought of as analogous to the MH.
- SYNONYM
- Entry terms to the NM, indicating that the topic should be indexed under the Name of Substance. These are often not true synonyms, but names of substances that are equivalent for retrieval purposes, for example, salts, trade names, and lab numbers.
- CAS TYPE 1 NAME
- For chemical names, the structural name assigned in accordance with the American Chemical Society naming convention. This structural name is a synonym of the NM.
- CAS REGISTRY/EC NUMBER
- For chemicals, the CAS registry number assigned to this compound. For enzymes, the Enzyme Commission number provides a classification of the activity of the enzyme. These numbers are a formal computable representation of the meaning.
- RELATED CAS REGISTRY NUMBER
- For chemicals, these are the registry numbers of congeners and derivatives indexed by the same named substance. These are usually the numbers for the Synonyms discussed above.
- HEADING MAPPED TO (HM)
- When citations are assigned an NM, the one or more MHs in this field are also assigned to the MEDLINE record. There is a structural or functional relationship between the NM and the MHs in this data element.
- INDEXING INFORMATION
- This information consists of MHs associated with the substance which should be considered in indexing or retrieving.
- NOTE
- This note is a free text definition, in which the meaning of the NM is circumscribed.
- PHARMACOLOGICAL ACTION
- The actions and uses of the various chemicals are indicated by the MHs in this field.
- PREVIOUS INDEXING
- The HM which was used to index the citations in the MEDLINE databases before the Supplementary Concept was created.
2.4 The Historical MeSH Structure Representation: A Comment
As with many thesauri, the historical structure of MeSH emphasized the relationships of items within the thesaurus to the main terms. In the database in which the vocabulary was maintained and in databases such as MEDLINE, the Main Heading, the Qualifier, or the Name of Substance was identified as a single term. Relationships were noted and maintained at the term level. The emphasis on term-level relationships led to some imprecisions and loss of ability to represent other relationships. For example, since the late 1980s, entry vocabulary was noted as being related to the Main Heading as synonymous, broader, related, or narrower, but relationships between entry terms were not and could not be noted.
In redesigning the MeSH thesaurus to a concept-oriented structure, it became possible to represent explicitly other important relationships. It became apparent that a Main Heading did not represent a single concept; rather, a Main Heading represented one or more concepts, and constituted a descriptor class (hereinafter referred to as a descriptor). The important distinction between the meaning of the term used to represent a Main Heading and the usage of that term (to represent the class of concepts entailed by the descriptor) is then more apparent, and the nature of a given hierarchical relationship in a thesaurus is clarified. These distinctions then make it possible to appreciate the differences between a thesaurus and a concept representation scheme.
3. THE GOALS OF MESH:
LOGICAL CONSTRAINTS ON THESAURUS CONSTRUCTION
In defining the role of the Medical Subject Headings, it seemed appropriate to develop a formal statement of the goal of MeSH. It is “to provide a reproducible partition of concepts relevant to biomedicine for purposes of organization of medical knowledge and information.”
This statement bears close examination, as many of the words selected for it have deep meanings. In order for MeSH to provide a reproducible partition of concepts, the headings must be approachable, make meaningful distinctions, be scientifically valid and current, and reflect a consistent approach. To provide a partition implies that the knowledge space must be covered in its entirety, without multiple ways of expressing the same ideas. That the headings partition concepts reflects the reality that not every concept we might wish to express is sufficiently distinct in its meaning that it would serve well as a Main Heading. That MeSH must cover all ideas relevant to biomedicine simply reflects the fact that many ideas not central to biomedicine might nevertheless be of interest. That MeSH’s role is one of organization is not a surprising claim, but serves to emphasize that it is not solely for indexing or for cataloging, but also to support retrieval. The emphasis on knowledge and information accentuates the idea that the role is not one of characterizing data such as might be seen in a clinical or research environment, but material at a higher level of intellectual organization.
3.1 Meaningful Distinctions
Main Headings should be distinct in meaning (as well as spelling) from other Main Headings in the thesaurus; that is, they should not overlap in meaning. The constraint of distinctness, necessary to achieve a partition, can be easily defined in the MeSH thesaurus. Since descriptors consist of classes of concepts, distinct descriptors are just those with concepts whose meaning does not overlap that of a concept in any other descriptor, and whose application will achieve a partition of the literature.
The one acceptable exception to this rule is for descriptors in a hierarchical relationship. Broader descriptors whose meaning encompasses the meaning of their descendants are usual. If two descriptors are in a hierarchical arrangement, the common NLM indexing convention is to assign the most specific descriptor that covers the topic discussed in an article.
However, if the overlapping or duplicate descriptors are not in a hierarchical arrangement, the indexer or searcher would have no way to determine which descriptor to use. Under those circumstances, indexing would necessarily be inconsistent and searching arbitrary. For example, the component concepts of the MeSH descriptor ‘Exercise’, ‘Isometric Exercise,’ and ‘Aerobic Exercise’ overlap in meaning. They are not sufficiently distinct in meaning to be useful in this thesaurus.
In other cases, the descriptor itself may be clear but its application would not be. For example, we might wish to make a distinction between ‘DNA Fingerprints’ and ‘DNA Fingerprinting.’ The meanings are distinct, but the literature is not. Discussions do not clearly distinguish between the process and the product. By way of contrast, we can note that ‘Radiography’ and ‘Radiographs’ are sufficiently distinct in the literature to warrant making them separate descriptors.
3.2 Approachability
To be approachable, the thesaurus must be organized in a clear and intuitive manner. Names of descriptors need to reflect the broad meaning of the concepts involved. The hierarchical relationships must be intellectually accessible to users of MeSH (e.g., clinician, librarian, and indexer). An indexer must be able to assign a given Main Heading to an article and a clinician must be able to find a given Main Heading in the tree hierarchy. Consistency in style, both in naming and in arranging the hierarchical relationships, is an important aspect of this approachability. Which relationships are and are not represented in a hierarchy becomes a significant issue for the thesaurus developer. It is not possible to develop rules that are applicable in all cases. Nevertheless, the goal of a principled approach to developing hierarchies remains an ideal.
3.3 Scientific Validity and Currency
Thesaurus terminology must not only consist of terms currently used in the documents indexed with the MeSH vocabulary (e.g., journal articles); it must also be consistent with currently accepted scientific theories and results in the corresponding biomedical field. There is an essential tension between the currency of a representation and its recognized validity. Today’s hypothesis or theory may easily be disproven; on the other hand, it may become a valid standard. Only time and further experience, both enemies of currency, will tell.
4. EQUIVALENCE AND SUBSTITUTIONARY RELATIONSHIPS:
ENTRY VOCABULARY AND THE DESCRIPTOR CLASS
The relationship of entry terms to main headings is one of the most essential in any thesaurus. Traditionally, entry vocabulary has been thought of as synonyms and quasi-synonyms of the main heading (quasi-synonyms are nonsynonymous terms that are otherwise indicative of the same concept; e.g., “dryness” would be a quasi-synonym of “wetness”) (National Information Standards Organization, 1994). Soergel (1985, p. 219) refers to these relationships as “equivalency,” though we would prefer to reserve that term as describing, in a mathematical sense, the essential relationship of synonymy. Moreover, these equivalence relationships are not the only ones appropriate in entry vocabulary. In MeSH, the presence of an entry term can be interpreted as an indicator that the Main Heading is the appropriate method of representing that specific meaning for the purpose of organizing literature. In an environment in which MeSH is used, the Main Heading could be used as an appropriate substitute for the entry term. This type of relationship can thus be thought of as substitutionary.
As noted in 2.4, the primary entities in MeSH have been the Main Heading and the term. In the process of designing a new MeSH maintenance system, it was desirable that a third entity, the concept, should be explicitly represented. In the development of the new system, the nature of Main Headings, as a cluster of one or more concepts, became clearer. In the new maintenance system, the identification of synonymy between entry terms is made by explicit reference to concepts. The new system can store the fact that ‘Laser Scalpel’ and ‘Laser Knife’ are synonyms of each other and entry terms to ‘Laser Surgery’. The new system formally represents that each term names the same concept.
The distinction between a descriptor, term, and concept in a single thesaurus is not new to the literature, but it has not been fully developed or exploited. As Soergel says, “One must carefully distinguish between the plane of concepts and the plane of terms…, lest confusion reign. The relationships between concepts and terms … are muddy at best” (1985, pp. 217-218). The three-level structure in MeSH (descriptor class, concept, and term) helps to make these relationships less “muddy.”
4.1 What Is a Concept?
The ordinary definition of a concept is the common idea or meaning expressed by synonymous words or terms. For example, the terms ‘Cardiac Arrest’ and ‘Heart Arrest’ express the same concept. This notion of a concept is not one of a novel or abstruse entity; indeed, a concept may be identified by the class of synonymous terms. For computational purposes, synonyms can be identified precisely as all terms that are members of the same concept class. Synonyms constitute a true equivalence class. They may be substituted for one another in an arbitrary expression without changing the meaning.
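A minimal sketch of this computational view of synonymy follows. The concept identifiers and the table layout are assumptions made for illustration; the term pairs are those used as examples in this chapter.

```python
# Hypothetical concept identifiers mapped to the terms that name them.
CONCEPT_TERMS = {
    "C1": {"Cardiac Arrest", "Heart Arrest"},
    "C2": {"Laser Scalpel", "Laser Knife"},
}

# Invert the mapping: each term names exactly one concept in this sketch.
TERM_TO_CONCEPT = {
    term: concept_id
    for concept_id, terms in CONCEPT_TERMS.items()
    for term in terms
}


def are_synonyms(a: str, b: str) -> bool:
    """Two terms are synonyms when they belong to the same concept class."""
    concept_a = TERM_TO_CONCEPT.get(a)
    concept_b = TERM_TO_CONCEPT.get(b)
    return concept_a is not None and concept_a == concept_b
```

Because the test is plain equality of concept membership, it is reflexive, symmetric, and transitive over known terms, which is what makes synonymy a true equivalence relation here.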
4.1.1 Relationship Between Concepts and Terms
The thesaurus literature has recognized the existence of concepts as distinct from terms in statements such as the one that the broader_than (BT) and narrower_than (NT) relationships are “really” between concepts and only between terms by derivation (Maniez, 1988, p. 220). Part of the reason for the confusion between concepts and terms is that it is necessarily difficult to talk of a concept without using at least one term. Often a concept has a preferred term, which is typically used to refer to the concept and may be viewed as the name of the concept (Soergel, 1985, p. 218; National Information Standards Organization, 1994, p. 2). However, since the identity of the preferred term may change, thesauri often use an arbitrary alphanumeric or numeric string as a Unique Identifier (UI), the persistent name of a concept. A term may change its meaning (think of how “gay” has changed in meaning over the past 50 years), but the UI should remain with the meaning to which it was originally assigned.
4.1.2 Relationship Between a Concept and Descriptor: The Descriptor Class
As noted earlier, the historical view of a descriptor in MeSH was as a group of terms, with the preferred term being the name of the Main Heading. Now, since a descriptor consists of one or more concepts, how is the name of the Main Heading or descriptor chosen? One term in each concept set is the preferred term for that concept. Analogously, one of the concepts in a descriptor class is the preferred concept; the name of that descriptor then is the preferred term of the preferred concept. For example:
- Descriptor: Coronary Disease
- Concept 1: Coronary Disease
- Concept 2: Coronary Occlusion
- Concept 3: Coronary Stenosis
Concept 1 is the preferred concept, with the other two concepts being subordinate. Each of the terms shown is the preferred term for each concept. (Other terms are not shown.) In general, the preferred concept would be the one somehow “broadest” in meaning; however, this broader relationship is poorly defined. Recognizability by a user is important, as well as the possibility that the user might look for the less common concepts under this name.
Since this role as preferred term of preferred concept may change, MeSH uses an arbitrary alphanumeric string, the Unique Identifier (UI), as the unchanging name of each descriptor. The descriptor UI (the D number) is not the same as the Concept UI (an M number). Using the UIs permits the representation to remain clear and the attributes of various objects to be linked to the appropriate object.
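The naming rule can be illustrated with the Coronary Disease example. In this sketch the identifiers are placeholders rather than real D or M numbers, and the dictionary layout is an assumption, not the format of the maintenance system.

```python
# Identifiers below are placeholders, not real MeSH UIs.
concepts = {
    "M-demo-1": {"preferred_term": "Coronary Disease"},
    "M-demo-2": {"preferred_term": "Coronary Occlusion"},
    "M-demo-3": {"preferred_term": "Coronary Stenosis"},
}

descriptor = {
    "ui": "D-demo-1",
    "preferred_concept": "M-demo-1",
    "concepts": ["M-demo-1", "M-demo-2", "M-demo-3"],
}


def descriptor_name(descriptor: dict, concepts: dict) -> str:
    """The descriptor's name is the preferred term of its preferred concept."""
    return concepts[descriptor["preferred_concept"]]["preferred_term"]


name_before = descriptor_name(descriptor, concepts)

# Choosing a different preferred concept changes the name but not the UI.
descriptor["preferred_concept"] = "M-demo-3"
name_after = descriptor_name(descriptor, concepts)
```

The name is computed rather than stored, so the descriptor's `ui` is the only value that other records need to hold.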
The new MeSH maintenance system attaches to a descriptor only those properties or fields that are appropriate to it. The list of allowable qualifiers is an attribute of the descriptor. Importantly, as discussed in Section 5, the position in a hierarchy is an attribute of the descriptor. On the other hand, the scope note or definition is an attribute of a single concept. Multiple definitions, one for each concept, can then be represented within the descriptor record.
The question may arise if a concept could be a member of more than one descriptor class. Allowing a concept to be in more than one descriptor class would destroy any hope of achieving a partition of the information space. MeSH does not permit membership in multiple descriptors. The polyhierarchical arrangement of descriptors is sufficient to support the goals of MeSH.
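The rule that a concept belongs to only one descriptor class can be checked mechanically. The function below is a hypothetical validation helper, not part of any MeSH tooling; it tests only that descriptor classes are disjoint, and does not test whether they cover the whole information space.

```python
from collections import defaultdict


def partition_violations(descriptors: dict) -> dict:
    """Map each concept found in more than one descriptor to those descriptors.

    `descriptors` maps a descriptor identifier to the concept identifiers
    it contains. An empty result means no concept is shared.
    """
    owners = defaultdict(list)
    for descriptor_id, concept_ids in descriptors.items():
        for concept_id in set(concept_ids):  # ignore repeats within one class
            owners[concept_id].append(descriptor_id)
    return {c: sorted(ds) for c, ds in owners.items() if len(ds) > 1}
```

Calling it with `{"D1": ["M1", "M2"], "D2": ["M2", "M3"]}` would report `M2` as shared between `D1` and `D2`.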
4.2 Relationship of Substitutionary Equivalence: Descriptors and Entry Terms
The Main Heading and entry terms are the names of the concepts in a descriptor class. In most bibliographic databases, and in MEDLINE in particular, the descriptor name (the preferred term of the preferred concept) is the one assigned and attached to the citation record. An entry term may be a synonym of the descriptor name, or it may be a name of an additional concept in the descriptor. The relationship between the entry term and the descriptor name can be thought of as one of substitutionary equivalence. The terms function equivalently in retrieval systems that employ MeSH as it was designed to be used.
This relationship is not as restrictive as that of synonymy, though it may be mistaken for synonymy. While entry terms are closely related to Main Headings, they need not be synonymous with the Main Heading or with each other. This practice is justified because it is often impractical and usually undesirable to restrict the entry vocabulary to strictly synonymous terms. Most indexed domains have limited scope, which may have few or no documents about a given group of synonyms. Retrieval in the domain does not require fine granularity for every term. For example, MeSH contains a descriptor for ‘Whales’ but the domain of MeSH is biomedicine and not zoology. In the MEDLINE citation database, there are not sufficient citations to create a separate descriptor for each specific whale species. Nevertheless, it is useful to have the species names as entry terms to the descriptor. Gains in precision of retrieval by creating more specific descriptors would be small.
The guiding principle in the development of entry vocabulary is the degree to which the distinction between the entry terms and the preferred term becomes important in conceptually partitioning the literature. If the body of literature is hopelessly fragmented by overly fine distinctions or if meaningful distinctions cannot be made, the utility of MeSH is reduced.
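Substitutionary equivalence amounts to replacing an entry term by its descriptor name before indexing or searching. The sketch below uses the ‘Laser Surgery’ entry vocabulary mentioned elsewhere in this chapter; the table layout and function names are assumptions made for illustration.

```python
# Entry vocabulary from the Laser Surgery example; the layout is an assumption.
ENTRY_TERMS = {
    "Laser Scalpel": "Laser Surgery",
    "Laser Knife": "Laser Surgery",
    "Laser Microsurgery": "Laser Surgery",
}
DESCRIPTOR_NAMES = {"Laser Surgery"}


def to_descriptor(term: str):
    """Return the descriptor name that stands in for `term`, or None if unknown."""
    if term in DESCRIPTOR_NAMES:
        return term
    return ENTRY_TERMS.get(term)


def normalize_query(terms: list) -> set:
    """Replace each query term by its descriptor name, dropping unknown terms."""
    return {d for d in map(to_descriptor, terms) if d is not None}
```

Note that the lookup treats a synonym and a narrower entry term in exactly the same way: both resolve to the one descriptor name, which is why the relationship is weaker than synonymy.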
4.3 Relationships Between Entry Terms and the Preferred Term
Most thesauri do not make explicit the relationships between the entry term and the preferred term except to note that they are equivalent to each other (and to indicate the preferred term) for purposes of indexing and retrieval (Soergel, 1985, p. 218; National Information Standards Organization, 1994, p. 15). For some time MeSH has gone beyond this and labeled some relations within descriptors, primarily the relationship of each entry term to the preferred term. For example, ‘Laser Microsurgery’ is an entry term to ‘Laser Surgery’ and we have labeled this relationship as Narrower. Other entry term relationships are labeled as Broader or Synonymous.
Relationships within descriptors are not identical to the relations between descriptors, also often referred to as BT and NT, that are represented in thesaurus hierarchies such as the MeSH Tree Structures. The relationship of belonging to the same descriptor class is somehow closer. The nature of the relationship is readily apparent: the preferred term is the descriptor name used in the database to represent the meanings named by the entry term. This is a relationship of substitutionary equivalence.
5. HIERARCHICAL RELATIONSHIPS:
TREES, SUBSUMPTION, AND MESH IN DOCUMENT RETRIEVAL
The hierarchical relationships are fundamental components in a thesaurus and can be powerful tools for retrieval. MeSH has long formalized its hierarchical structure in an extensive tree structure, currently at nine levels, representing increasing levels of specificity. This structure enables browsing for the appropriately specific descriptor, and is the basis of automatic and very powerful searching of all more specific topics.
The hierarchical relationships in the MeSH trees reflect a number of aspects of the main heading, both definitional and intentional. However, the relationships represented in the traditional hierarchies were implicit and not well defined. The relationship was termed a “broader than” relationship. Without a more formal definition of this relationship, such a statement is almost meaningless. “Broader than” relationships can be as formal as a set of meronymic and hyper/hyponymic relationships, for example, the set composed of “is_a,” “part_of,” “conceptual_part_of,” and “process_of” (see National Information Standards Organization, 1994, pp. 17-19). A thesaurus using only these relationships would approximate a concept representation scheme. While subsumption is often an important part of the arrangement of the hierarchies, other needs come into play with MeSH.
Many examples of hierarchical relations are instances of the part/whole and class/subclass relationships, which are relatively well understood, not only theoretically but also in their application. However, these relationships do not capture all hierarchical relationships and may not well explain any of them. A classic example is the case of ‘Accidents,’ which has as a narrower term, ‘Accident Prevention.’ Clearly, ‘Accident Prevention’ is not part of ‘Accidents’ nor is it included in the class of ‘Accidents.’ Soergel gives the following “hierarchy test” for retrieval of documents: “Should a search for documents dealing with A find all (or most) documents dealing with B? If yes, A is broader than B (and conversely, B is narrower than A)” (1985, p. 252). Given this test, it is easy to see that the BT/NT relationships thus represented are at the level of the descriptor. The Soergel definition defines the nature of this relationship.
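Soergel's hierarchy test can be read as a statement about retrieval sets. The sketch below is one possible operationalization, not a definition taken from the source: the function name is hypothetical, and the `threshold` used to interpret "most" is an arbitrary assumption.

```python
def is_broader(search_a: set, about_b: set, threshold: float = 0.5) -> bool:
    """Hierarchy test: does a search for A find all (or most) documents about B?

    search_a  -- document ids retrieved by a search for A
    about_b   -- document ids dealing with B
    threshold -- hypothetical cut-off for "most"
    """
    if not about_b:
        return False  # nothing to test against
    found = len(search_a & about_b) / len(about_b)
    return found > threshold
```

The test compares sets of documents, not meanings, which is why it can place ‘Accident Prevention’ under ‘Accidents’ even though neither a part/whole nor a class/subclass relationship holds between them.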
Moreover, a search for documents about accidents in MEDLINE should find documents about accident prevention. It is the relationship of “aboutness” that is fundamental to a hierarchy in a thesaurus used for document retrieval. The relationships of part/whole and class/subclass have a role in subject retrieval because if a document is about a subclass, then it is also likely to be about the class. Further analysis of this notion of subject or aboutness might provide us with rules for assigning, if not for using, a hierarchy in document retrieval (Maron, 1977; Harper, 1989).
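Soergel's hierarchy test can be read as a check over judgments about documents. The following is a minimal sketch under stated assumptions, not a procedure given by Soergel or used by MeSH: the function name, the representation of judgments as a list of booleans, and the 0.8 cutoff standing in for “all (or most)” are all hypothetical.

```python
def passes_hierarchy_test(wanted_by_a_searcher, threshold=0.8):
    """Return True if A may be treated as broader than B.

    wanted_by_a_searcher holds one boolean per document dealing with B:
    True if a search for A should find that document.
    threshold is an assumed stand-in for "all (or most)".
    """
    if not wanted_by_a_searcher:
        return False  # no evidence, so no relationship is asserted
    share = sum(wanted_by_a_searcher) / len(wanted_by_a_searcher)
    return share >= threshold


# Invented judgments: 4 of 5 documents about 'Accident Prevention'
# should be found by a search for 'Accidents'.
judgments = [True, True, True, True, False]
print(passes_hierarchy_test(judgments))  # True
```

Note that the test takes judgments about documents, not definitions of terms, as its input, which is why the resulting relationship holds at the level of the descriptor.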
Arranging material hierarchically with these criteria should result in the placement of a descriptor in more than one hierarchy. This is a perfectly reasonable thing to do. What is less obvious is that the narrower descriptors (“children”) of a given descriptor do not need to be identical in different trees. For example, consider the descriptor ‘Nose.’ As a child of ‘Sense Organs,’ ‘Nose’ should appropriately have the children ‘Vomeronasal Organ’ and ‘Olfactory Bulb’; however, as a child of ‘Face,’ its children should include ‘Nasal Bones.’ A search based on the descriptor ‘Nose’ in the context of the ‘Sense Organs’ hierarchy should retrieve citations about the olfactory bulb, but not those about nasal bones.
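The 'Nose' example can be sketched as a lookup keyed by both the descriptor and the tree in which it is being viewed. This is only an illustration of the idea: the dictionary layout and function names are assumptions, and the data contain nothing beyond the descriptors named in the paragraph above.

```python
# Children are keyed by (parent context, descriptor), so the same
# descriptor can have different children in different trees.
CHILDREN = {
    ("Sense Organs", "Nose"): ["Vomeronasal Organ", "Olfactory Bulb"],
    ("Face", "Nose"): ["Nasal Bones"],
}


def narrower(descriptor, context):
    """Return the children of a descriptor within one tree context."""
    return CHILDREN.get((context, descriptor), [])


def retrieval_set(descriptor, context):
    """Descriptor plus its children in the given context (one level only)."""
    return {descriptor, *narrower(descriptor, context)}


sense = retrieval_set("Nose", "Sense Organs")
print("Olfactory Bulb" in sense)  # True
print("Nasal Bones" in sense)     # False
```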
Many individuals have tried to use MeSH as a concept representation language with only modest success. That the relationships in the MeSH tree structures were designed with a different view, and with a different (and not formal) meaning of “broader_than,” has frustrated their efforts. The MeSH hierarchical structure was designed to reflect a view of the literature for a user. Articles are indexed with the most specific headings available. The use by a searcher of a parent heading would allow a broader search. In the ELHILL system, this was known as an “explode,” and required the searcher to indicate that they wished to include all the more specific terms. The trees thus indicate what appears to be a useful set of relationships, based on the perceived needs of searchers.
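The effect of an “explode” can be sketched as follows. This is a simplified illustration and not the ELHILL implementation: the tree holds only the 'Accidents' example used earlier, and the document identifiers and index are invented.

```python
# Parent heading -> more specific headings (only the example from the text).
TREE = {"Accidents": ["Accident Prevention"]}

# Invented index: each document carries its most specific headings.
INDEX = {
    "doc1": {"Accidents"},
    "doc2": {"Accident Prevention"},
    "doc3": {"Nose"},
}


def explode(heading, tree):
    """Return the heading together with all of its descendants."""
    headings = {heading}
    for child in tree.get(heading, []):
        headings |= explode(child, tree)
    return headings


def search(heading, index, tree, exploded=False):
    """Find documents indexed with the heading, optionally exploded."""
    headings = explode(heading, tree) if exploded else {heading}
    return sorted(doc for doc, assigned in index.items() if assigned & headings)


print(search("Accidents", INDEX, TREE))                 # ['doc1']
print(search("Accidents", INDEX, TREE, exploded=True))  # ['doc1', 'doc2']
```

Because articles are indexed with the most specific heading available, the unexploded search misses the document on accident prevention; the exploded search recovers it.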
Since its hierarchical relationships are between descriptors, there are practical reasons for a MeSH descriptor to have different children in different trees. The MeSH hierarchy differs substantially from a concept representation language. Hierarchical relationships in concept representation are at the level of the concept; in the MeSH thesaurus, these relationships are at the level of the descriptor.
6. ASSOCIATIVE RELATIONSHIPS: THE “SEE RELATED” CROSS-REFERENCE AND OTHER ATTRIBUTES
Associative relationships have often been thought to be similar to hierarchical relationships, but somehow looser and less clearly defined. These are often relationships for which inclusion in an “explode” type of search is not mandatory, but should be considered under certain circumstances. Another use for the associative relationship is to point out the existence of other descriptors in the thesaurus that may be more appropriate for a particular purpose. Associative relationships may also point out distinctions made in the thesaurus or in the way the thesaurus has arranged descriptors hierarchically.
Many associative relationships are represented by the “see related” cross-reference. The categories of these relationships seem to be greater in number, and are certainly more varied, than those of hierarchical relationships. (See Harper’s 1989 analysis of the associative relationships in MeSH.) While some of these relationships may have definitional value, the most that can be said of them for certain is that they serve as important reminders of the impossibility of representing all the important relationships between ideas in a set of hierarchies.
One attribute which can be thought of as an associative relationship within the MeSH thesaurus is the Pharmacologic Action. Limited to chemicals (which comprise approximately half of the Main Headings in MeSH), this relationship allows the aggregation of chemicals by actions or uses. Two chemicals with the same Pharmacologic Action thus share an association by virtue of sharing the same property.
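Aggregation by Pharmacologic Action amounts to inverting a chemical-to-action mapping. The sketch below assumes a simple dictionary representation; the three chemicals and their actions are illustrative assignments chosen for the example and are not copied from MeSH records.

```python
from collections import defaultdict

# Illustrative data only; not taken from MeSH records.
PHARMACOLOGIC_ACTIONS = {
    "Aspirin": {
        "Anti-Inflammatory Agents, Non-Steroidal",
        "Platelet Aggregation Inhibitors",
    },
    "Ibuprofen": {"Anti-Inflammatory Agents, Non-Steroidal"},
    "Warfarin": {"Anticoagulants"},
}


def chemicals_by_action(actions):
    """Invert chemical -> actions into action -> set of chemicals."""
    grouped = defaultdict(set)
    for chemical, chemical_actions in actions.items():
        for action in chemical_actions:
            grouped[action].add(chemical)
    return dict(grouped)


def share_action(first, second, actions):
    """True if two chemicals have at least one action in common."""
    return bool(actions.get(first, set()) & actions.get(second, set()))


groups = chemicals_by_action(PHARMACOLOGIC_ACTIONS)
print(sorted(groups["Anti-Inflammatory Agents, Non-Steroidal"]))
# ['Aspirin', 'Ibuprofen']
print(share_action("Aspirin", "Warfarin", PHARMACOLOGIC_ACTIONS))  # False
```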
An Entry Combination, which points or maps users to a different descriptor rather than allowing them to use a combination of a descriptor with a specific qualifier, is a different type of associative relationship. These relationships attempt to maintain the distinctions between usages of descriptors where a meaning represented by the descriptor/qualifier combination may overlap the meaning of a different descriptor.
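An Entry Combination can be pictured as a lookup that replaces a forbidden descriptor/qualifier pair with the descriptor to be used instead. The sketch below is an assumption about how such a mapping might be represented, not a description of the MeSH data format; the single entry reuses the 'Accidents' example from earlier in the chapter and is meant as an illustration only.

```python
# (descriptor, qualifier) -> descriptor to use instead. Illustrative entry.
ENTRY_COMBINATIONS = {
    ("Accidents", "prevention & control"): "Accident Prevention",
}


def resolve(descriptor, qualifier):
    """Map a descriptor/qualifier pair through the Entry Combinations.

    Returns (descriptor, qualifier); the qualifier is None when the
    pair has been replaced by a different descriptor.
    """
    replacement = ENTRY_COMBINATIONS.get((descriptor, qualifier))
    if replacement is not None:
        return (replacement, None)
    return (descriptor, qualifier)


print(resolve("Accidents", "prevention & control"))
# ('Accident Prevention', None)
print(resolve("Accidents", "statistics"))
# ('Accidents', 'statistics')
```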
The transition of MeSH from a term-based structure into a concept-oriented structure has shed some light on important relationships within a thesaurus. A descriptor represents one or more concepts. The role of entry vocabulary is to provide a guide for the proper choice of a descriptor. Avoiding overlap in meaning between descriptors emerges as another motivation for the use of entry vocabulary, as well as for the associative relationships provided by the Entry Combination. The hierarchical relationships are of primary importance as keys to the organization of a body of literature. The “broader than/narrower than” relationship in thesaurus hierarchies is one of representing a broader or narrower retrieval set. The differences between a document-retrieval thesaurus and a concept-representation language thus become apparent.
Bachrach, C. A. & Charen, T. (1978). Selection of MEDLINE contents, the development of its thesaurus, and the indexing process. Medical Informatics, 3, 237-254.
Harper, C. R. (1989). Associative relationships in the MeSH thesaurus. Associate Project Report. National Library of Medicine.
Maniez, J. (1988). Relationships in thesauri: Some critical remarks. International Classification, 15, 133-138.
Maron, M. E. (1977). On indexing, retrieval, and the meaning of about. Journal of the American Society for Information Science, 28, 38-43.
McCray, A. T. & Hole, W. T. (1990). The scope and structure of the UMLS semantic network. In Miller, R. A. (Ed.), Proceedings of the Fourteenth Annual Symposium on Computer Applications to Medical Care, pp. 126-130. New York: IEEE Computer Society.
National Information Standards Organization. (1994). Guidelines for the construction, format, and management of monolingual thesauri. Bethesda, MD: NISO Press. (ANSI/NISO Z39.19-1993).
Nelson, S. J., Aronson, A. R., Doszkocs, T. E., Wilbur, W. J., Bodenreider, O., Chang, H. F., Mork, J., & McCray, A. T. (1999). Automated assignment of Medical Subject Headings. Journal of the American Medical Informatics Association, Symposium Supplement, 1127.
Schuyler, P. L., Hole, W. T., Tuttle, M. S., & Sherertz, D. D. (1993). The UMLS metathesaurus: Representing different views of biomedical concepts. Bulletin of the Medical Library Association, 81, 217-222.
Soergel, D. (1985). Organizing information. Orlando, FL: Academic Press.