What is a Thesaurus
A thesaurus is a structured list of the expressions used, on the one hand, to represent the contents of documents (indexing) and, on the other hand, to search for these documents in a documentary system.
According to Nogueras-Iso et al. (2005), a thesaurus is an organised collection of terms enriched with relations that are linked to one another by cross-referencing to the database. A term is a word (or expression) that represents a conceptual category, and for that reason a thesaurus can be seen as a structured and controlled vocabulary of words, represented in either monolingual or multilingual format.
The adoption of such semantic tools has the benefit of ensuring that a subject will be described using the same preferred term each time it is indexed and this will make it easier to find all information about a specific topic during the search process. This means that using a thesaurus improves search results due to the inclusion of information about the relationships of words or expressions that represent standardised relationship indicators. The thesaurus specifies which word or expression will be used as the preferred (or authorised) term for a particular concept and then connects the non-preferred terms or synonyms to it. The purpose is to provide a wide range of paths to reach those terms. Ultimately, it works as a classification tool to ease the communication process for indexers and information seekers that need to speak a common language.
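The mapping from non-preferred terms to a preferred term can be pictured as a simple lookup table. The following is a minimal sketch only; the legal terms and the names `PREFERRED_TERM` and `to_preferred` are hypothetical and are not taken from the Legal Thesaurus.

```python
from typing import Optional

# Hypothetical entry-term table: every known variant points to one preferred term.
PREFERRED_TERM = {
    "lawyer": "lawyer",
    "attorney": "lawyer",
    "counsel": "lawyer",
    "contract": "contract",
    "agreement": "contract",
}


def to_preferred(term: str) -> Optional[str]:
    """Return the preferred term for a variant, or None if the variant is unknown."""
    return PREFERRED_TERM.get(term.strip().lower())
```

An indexer and a searcher who both pass their wording through `to_preferred` end up with the same term, which is the "common language" effect described above.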
Thesauri have been an important tool in Information Retrieval for decades and still are. They have the potential to greatly improve the information management of large organizations, but they are still underused in content management systems, search engines or tagging systems.
Thesauri can be used to support various application scenarios such as autocomplete, faceted search and browsing, recommendations or glossaries. In these scenarios thesauri usually perform the function of harmonizing terminologies, controlling vocabularies and/or supporting the user in browsing through a concept space.
The distinction between vocabulary and thesaurus has been addressed.
BS-8723-1, for example, defined a controlled vocabulary as a prescribed list of terms or headings each one having an assigned meaning [noting that “Controlled vocabularies are designed for use in classifying or indexing documents and for searching them.”] and a thesaurus as a
Controlled vocabulary in which concepts are represented by preferred terms, formally organised so that paradigmatic relationships between the concepts are made explicit, and the preferred terms are accompanied by lead-in entries for synonyms or quasisynonyms. (BS-8723-1, sect. 2.39)
About this Guide
This Guide is intended for users of the thesaurus. It mainly serves as a guide on how to work with the Legal Thesaurus, aiming to be useful especially for a non-technical audience. You can also find some basics on concepts like thesauri and technologies like SKOS.
Concepts, Top Concepts and Concept Schemes
Concept schemes: A concept scheme can hold multiple top concepts and concepts.
Top-Concepts: The concepts on the first level below a concept scheme are called top-concepts, which in turn can hold other concepts. Each concept can have children, so-called sub-concepts.
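The containment described here (scheme, top concepts, sub-concepts) can be sketched as a small tree structure. This is an illustrative sketch; the class names and the sample concepts are assumptions and do not reflect the actual content of the Legal Thesaurus.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Concept:
    label: str
    narrower: List["Concept"] = field(default_factory=list)  # sub-concepts


@dataclass
class ConceptScheme:
    label: str
    top_concepts: List[Concept] = field(default_factory=list)


def walk(concept: Concept, depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (depth, label) for a concept and all of its sub-concepts."""
    yield depth, concept.label
    for child in concept.narrower:
        yield from walk(child, depth + 1)


# Hypothetical sample data.
scheme = ConceptScheme(
    "Legal Thesaurus",
    [Concept("Contract law", [Concept("Sale of goods")])],
)
```

Calling `walk` on each entry of `scheme.top_concepts` lists the hierarchy level by level, which is enough to render a simple browsing tree.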
Thesauri with SKOS-based applications
Since its first release in 2004 the W3C recommendation SKOS (Simple Knowledge Organization System) has been utilized by several Semantic Web applications as a lightweight model to support interoperability at the terminological and schematic level (see: Avesani, 2005; Kules, 2006; Sah, 2007; Abel, 2008; Davies, 2008; Tordai, 2009; Golub, 2009; Echarte, 2009). “SKOS provides a vocabulary to define the basic structure and content of semi-formal knowledge organizations such as thesauri, classification schemes, subject heading lists, taxonomies, folksonomies and other similar controlled vocabularies. Since it is designed on RDF, SKOS allows these semi-structured concepts to be published on the Web, linked to data available on the Web and also incorporated with other concept schemes.” (Sacco, 2010). Its comparatively low ontological (semantic) complexity makes SKOS an ideal standard for collaborative knowledge organization purposes, especially within the context of socially generated classification schemes (see: Orlandi, 2010; Waitelonis, 2010; Sah, 2010).
With the Linked Data initiative gaining momentum in the past years, SKOS has emerged as a common ‘standard’ (currently a W3C recommendation) for expressing knowledge organization systems (KOS) such as thesauri or taxonomies. SKOS features a concept-oriented approach, with a concept being “An idea or notion; a unit of thought.” (as defined in the SKOS definition itself) that can be represented with a URI. A term-oriented approach, proposed e.g. in ISO 2788, ISO 5964 and other older thesaurus standards, on the other hand treats lexical entries (terms) as the most basic units. A detailed comparison of the two approaches can be found in the appendix of the SKOS Primer (Isaac, 2008).
Most term-based standards (ISO 2788 1986, ISO 5964 1985) were developed in a pre-web era and are being revised in the new ISO standard (ISO 25964-1), in which the term-oriented approach has also given way to a concept-based approach: “The traditional aim of a thesaurus is to guide the indexer and the searcher to choose the same term for the same concept… The concepts are represented by terms, and for each concept, one of the possible representations is selected as the preferred term…” (Sacco, 2010).
Another sign of the importance of having controlled vocabularies in web-oriented formats like SKOS is that more and more existing vocabularies are offered in SKOS versions alongside the classic formats provided up until now. Transformations have been made for thesauri like Agrovoc (Morshed, 2010), Eurovoc (Rodríguez, 2008), GEMET (Miles, 2004) and the STW Thesaurus for Economics (Neubert, 2009) but also for other types of controlled vocabularies like subject headings.
Thesaurus-based application scenarios. An overview
“Today’s thesauri are mostly electronic tools, having moved on from the paper-based era when thesaurus standards were first developed. They are built and maintained with the support of software and need to integrate with other software, such as search engines and content management systems. […] Whereas in the past thesauri were designed for information professionals trained in indexing and searching, today there is a demand for vocabularies that untrained users will find to be intuitive, and for vocabularies that enable inferencing by machines.” (ISO 25964-1, 2011). This introduction to the new ISO standard already states that the scope and use of thesauri have changed since the shift from the pre-web to the web era. Broughton (2006) states that the main application scenarios for thesauri are indexing, providing metadata, search (query formulation and expansion) and browsing and navigation. Soergel (2002) defines use cases for thesauri in the context of digital libraries. According to his notion they support learning and assimilating information, assist researchers and practitioners with problem clarification, support information retrieval (support searching, meaningful information display, indexing, facilitate combination/access of multiple databases) and support document processing after retrieval. For Semantic Web applications, the information retrieval and document processing aspects on top of controlled vocabularies are of particular importance. With reference to the use cases proposed by Soergel (2002) and Broughton (2006) we provide a short description of the practical applicability of thesauri for the following application scenarios.
Filtering / Classifying
From an end-user perspective, the ability to browse classifications is useful to get an impression about the level of detail and type of the data held in an information system. Thus it enables a user to search the system even if no search terms should or can be formulated. This use case has been identified as “browsing the classification structure” in Soergel (2002). Broughton (2006) also states that “the thesaurus is often used as an aid to navigation or browsing through the systematic display”. These are examples of using the “classification structure” from a user’s perspective, but this structure can also be used to classify documents automatically or to provide a filter structure (facets) for narrowing down search results.
A thesaurus can be used to filter, browse or classify content by categories. As learning curves for complex classifications are steep, a static hierarchy with a defined scope (limited number of concepts) is preferable compared to a dynamic one. Hence the quantity of valid concepts and labels is restricted by the application. Equivalence relations are relevant for categorization as they increase the semantic consistency of a thesaurus, while polyhierarchies and homonyms should be avoided as they increase complexity. The hierarchical depth is restricted by the application. Depending on the completeness of the vocabulary, additional information to the subjects might be presented as part of a glossary (see below). Associative relations, definitions and notes are not relevant for classification purposes.
A thesaurus can improve standard indexing functionalities for documents (statistical or linguistic) by providing domain knowledge for the extraction, resulting in better indexing results. The higher the domain specificity of a thesaurus, the better the indexing results will be. Hence the number of concepts and labels within a thesaurus is restricted by the scope of the domain. Equivalence relations are highly relevant for indexing documents as they increase the lexical explorativity of a document corpus, while hierarchical and associative relations are not relevant for indexing purposes, as they mainly play a role in the retrieval of indexed content objects, which is covered in our case in the recommendation scenario (see below). Indexing will go hand in hand with statistical and linguistic approaches for extracting terms. This can also support a semi-automatic thesaurus maintenance approach that identifies frequently extracted terms not found in the thesaurus and suggests them as new concepts.
Broughton (2006) states: “As it was first developed, the thesaurus was an indexing tool for large technical document collections.” Indexing is perhaps the most common application scenario for thesauri. A definition is given e.g. in ANSI/NISO Z39.19: “Indexing is the process of assigning preferred terms or headings to describe the concepts and other metadata associated with a content object.” (ANSI/NISO Z39.19, 2005).
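Thesaurus-supported indexing and the semi-automatic maintenance idea can be sketched together. This is a deliberately simplified sketch: it only handles single-word labels, and the label table, stopword list and threshold are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical label table: preferred and alternative labels map to a concept.
LABEL_TO_CONCEPT = {"contract": "contract", "agreement": "contract", "tort": "tort"}
STOPWORDS = frozenset({"the", "a", "of", "and", "is", "in"})


def index_document(text):
    """Return the sorted concepts found in the text and the raw word counts."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    concepts = sorted({LABEL_TO_CONCEPT[w] for w in counts if w in LABEL_TO_CONCEPT})
    return concepts, counts


def candidate_terms(counts, min_count=2):
    """Suggest frequent words that are not yet labels in the thesaurus."""
    return sorted(
        word
        for word, n in counts.items()
        if n >= min_count and word not in LABEL_TO_CONCEPT and word not in STOPWORDS
    )
```

For the text "The agreement covers liability. Liability in tort is excluded." the document is indexed with `contract` and `tort`, and `liability` is proposed as a candidate for a thesaurus editor to review.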
Query Formulation / Expansion
Moderated search provides knowledge-based support for end-users when vertically exploring a domain and/or when exercising federated search. This application has proven especially useful when combined with free-text search to enable structured browsing and formulating complex queries. Broughton (2006) distinguishes between query formulation and expansion, where the first means that additional search terms are provided to the user from the thesaurus and can be added to the search in the frontend while the second means that the search is enriched by the structure of the thesaurus automatically by the search engine without user interaction.
A thesaurus as a search tool supports query formulation and query expansion. Query terms can be widened, narrowed or translated based on the terminological pool of the thesaurus and the corresponding semantic relations. In a moderated search alternative labels (equivalence relations) and related concepts (associative relations) are used to expand the search query. While equivalence relations are well suited to define the lexical entry point into a knowledge model, associative relations help to broaden the context, in which a search query demands validity. Hierarchical relations may also be used to show alternative search terms within a given context (path dependence) but are generally of minor importance for the query construction. For better navigation, results can be sorted according to their classification or filtered according to defined facets as a result of a previous classification (see above).
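Query expansion with alternative labels and related concepts can be sketched as follows. The thesaurus entries and the function name `expand_query` are hypothetical; a real system would read them from the SKOS data.

```python
# Hypothetical thesaurus fragment: alternative labels and related concepts per concept.
THESAURUS = {
    "contract": {"alt": ["agreement"], "related": ["breach of contract"]},
    "lawyer": {"alt": ["attorney", "counsel"], "related": []},
}


def expand_query(term, include_related=True):
    """Return the search term plus its alternative labels and, optionally, related concepts."""
    entry = THESAURUS.get(term)
    if entry is None:
        return [term]  # unknown terms are passed through unchanged
    terms = [term] + entry["alt"]
    if include_related:
        terms += entry["related"]
    return terms
```

Switching `include_related` off keeps the expansion to equivalence relations only, which narrows the result set to the lexical entry point described above.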
A thesaurus can provide recommendations that could improve retrieval of indexed content, autocomplete suggestions or query formulation/expansion (see above) by using the domain knowledge built in the thesaurus via relations. All relation types are relevant for providing recommendation but especially associative relations and hierarchical relations play an important role because they could be used to suggest alternative search queries or help to retrieve content that is not directly related to the search terms but related to the subject of the search (e.g., using broader or sibling terms in a hierarchy) or related to the scope of the search (e.g., using related terms).
When browsing (or searching) an information system, recommended items help to broaden the user’s view on the contained data. Often search terms are badly formulated or the existing structure doesn’t fit the browsing needs of the user. A thesaurus can provide those recommendations via the knowledge model that is built around its concepts using synonyms and relations to recommend content or expand queries. Burke (2000) provides examples and experiments with knowledge-based recommendation systems. We could not find any direct relations to the use of thesauri for recommendation but we think that this application scenario is implicitly covered by (or at least an extension to) the scenarios mentioned before.
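Recommending the broader concept and sibling concepts can be sketched with a single broader-link table. The concepts and the name `recommend` are assumptions made for illustration only.

```python
# Hypothetical hierarchy: each concept points to its single broader concept.
BROADER = {
    "sales contract": "contract",
    "lease": "contract",
    "contract": "civil law",
}


def recommend(concept):
    """Return the broader concept followed by the siblings of the given concept."""
    parent = BROADER.get(concept)
    if parent is None:
        return []  # top-level or unknown concept: nothing to recommend here
    siblings = sorted(c for c, b in BROADER.items() if b == parent and c != concept)
    return [parent] + siblings
```

A search for "lease" could then suggest "contract" and "sales contract" as alternative queries.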
Glossaries support the users of an information system in interpretation of the contained data. They can be a starting point to access/learn about a domain and also a reference point where a domain or the concepts of a domain are defined. Soergel (2002) defines “Support learning and assimilating information” and “Support meaningful information display” as functions of a thesaurus and we regard glossaries as the right tool to fulfill these functions.
Glossaries can be beneficial for the user in various ways. Since the aim is to describe the concepts of a domain completely, all structural elements defined are relevant. A glossary should provide a consistent and complete overview of a domain and could thereby serve as a knowledge base or agreed reference of terminology for that domain. This also implies the need to clarify the meaning of concepts defined in a thesaurus by providing definitions, examples and scope notes. In this context, a thesaurus-based glossary can be seen as a source of metadata that can be used, for example, to provide context-sensitive help in information systems.
Linked Open Data
In the Semantic Web, Linked Open Data (LOD) is the reference approach for linking digital resources with those already available online. Linked Open Data may be:
· A technology to combine the many pieces of information we get from data providers.
· A way to share that data with other parties.
· A way to give users the best possible search experience.
Linked Open Data provides content providers with a set of rules, tools and recommendations. All the data in the Encyclopedia has to be named and linked.
The Legal Thesaurus
The Legal Thesaurus ontology is an extension of the W3C recommendation SKOS (Simple Knowledge Organization System), including its Appendix B, the SKOS eXtension for Labels (SKOS-XL).
The Legal Thesaurus is a subclass of the SKOS “Concept Scheme” class.
Namespaces used in the ontology, and prefixes used in this document
Dublin Core: dc = http://purl.org/dc/elements/1.1/
Legal Thesaurus Schema = https://legalthesaurus.org/schema/
OWL: owl = http://www.w3.org/2002/07/owl#
RDF: rdf = http://www.w3.org/1999/02/22-rdf-syntax-ns#
RDF Schema: rdfs = http://www.w3.org/2000/01/rdf-schema#
SKOS: skos = http://www.w3.org/2004/02/skos/core#
SKOS-XL: xl = http://www.w3.org/2008/05/skos-xl#
XML Schema: xsd = http://www.w3.org/2001/XMLSchema#
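Prefixed names such as skos:prefLabel are shorthand for full IRIs. The sketch below expands them using the namespace IRIs as published by W3C and DCMI, which use the http scheme. The function name `expand` is hypothetical, and the Legal Thesaurus schema namespace is left out because no prefix is given for it.

```python
# Prefix table for the standard vocabularies listed above.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "xl": "http://www.w3.org/2008/05/skos-xl#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(prefixed_name):
    """Expand a prefixed name such as 'skos:prefLabel' into a full IRI."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError("unknown prefix in " + repr(prefixed_name))
    return PREFIXES[prefix] + local
```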
Used Schema Annotations
Legal Thesaurus resources have been modelled as direct extensions of the SKOS and SKOS-XL classes and properties. Some Dublin Core properties have been used.
Legal Thesaurus Areas and microthesauri
EuroVoc is split into several areas and several microthesauri. Each area is divided into a number of microthesauri. A microthesaurus is considered as a concept scheme with a subset of the concepts that are part of the complete Legal thesaurus.
As SKOS has no rules to define a hierarchy of concept schemes, several concept collections have been used.
Microthesauri are represented as instances of iso-thes:microThesaurus, which is declared as a subproperty of skos:inScheme and as rdfs:subClassOf skos:ConceptScheme.
Each microthesaurus belongs to exactly one domain.
The name of each Area uses skos:prefLabel, with one label per language.
The area identifier is represented using the Dublin Core property “dc:identifier” and is part of the area name.
The microthesaurus identifier is represented by the property dc:identifier and is part of the microthesaurus name.
The property “skos:hasTopConcept” defines the top thesaurus concepts in a microthesaurus, that is, those with no broader relationships.
In true generic hierarchies, thematically related concepts are often scattered across many branches of the hierarchy. Associative relationships allow for lateral connection of concepts, but are rarely suitable for synoptic views of thematically related concepts. Concept groups allow for compiling concepts from different facets and hierarchies under a common name.
Neither membership in a concept group, nor the nesting of concept groups can be equated to a BT/NT relationship.
“Many thesauri group concepts using a classification structure that exists in parallel to the hierarchies of thesaurus concepts based on BT/NT relationships. Groups created by the classification are often based on disciplines, subject areas or areas of business activity. They are sometimes called “subject categories”, “themes”, “domains”, “groups”, “subsets” or “microthesauri”. The model provides for all of these by providing the classes ConceptGroup and ConceptGroupLabel and the specific type may be indicated by the attribute conceptGroupType. In general, there is not a BT/NT relationship between a ConceptGroup and the concepts that it contains.” [ISO 25964-1, 15.2.18]
The class ConceptGroup is mapped to skos:Collection.
The Legal Thesaurus offers a classified structure of what the ISO standard defines as concept groups (subject fields or subject categories) for grouping thesaurus concepts by themes. The Legal Thesaurus assigns concepts to Concept Groups. Note that concepts can be assigned to multiple categories. The categories are nested and thus form a chain of superordinate and subordinate categories.
Subject categories can be represented as concept groups as defined by the standard. The concept groups are nested by a hasSubgroup/hasSupergroup relationship in the ISO model.
The hasSubgroup relationship can be mapped to skos:member.
Legal Thesaurus (Concept Scheme)
    International Law (Concept Group)
        Public International Law (Concept Group)
In this example, “International Law” is not a concept scheme because the members of this group shall not be used independently of the entire thesaurus.
It is possible to organise the “Public International Law” theme into sub-themes by creating nested concept groups.
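The nesting of the International Law example can be written down as skos:Collection resources linked by skos:member. The sketch below only builds the triples as plain tuples; the `ex:` prefix and the way identifiers are derived from labels are assumptions made for illustration.

```python
# Group label -> labels of its direct subgroups (taken from the example above).
GROUPS = {
    "International Law": ["Public International Law"],
    "Public International Law": [],
}


def member_triples(groups):
    """Return (subject, predicate, object) tuples for nested concept groups."""

    def iri(label):
        # Hypothetical identifier scheme: 'ex:' plus the label without spaces.
        return "ex:" + label.replace(" ", "")

    triples = []
    for group, subgroups in groups.items():
        triples.append((iri(group), "rdf:type", "skos:Collection"))
        for sub in subgroups:
            # hasSubgroup in the ISO model is expressed as skos:member.
            triples.append((iri(group), "skos:member", iri(sub)))
    return triples
```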
Thesaurus standards distinguish between two basic relationships between concepts: hierarchical and associative. According to ISO 25964-1, hierarchical relationships hold “between a pair of concepts when the scope of one of them falls completely within the scope of the other. It should be based on degrees or levels of superordination and subordination, where the superordinate concept represents a class or whole, and subordinate concepts refer to its members or parts” [10.2.1]. For example, fat milk is a kind of milk, and milk in turn is a dairy product. Associative relationships cover “associations between pairs of concepts that are not related hierarchically, but are semantically or conceptually associated to such an extent that the link between them needs to be made explicit in the thesaurus, on the grounds that it may suggest additional or alternative terms for use in indexing or retrieval” [10.3.1].
The ISO standard rules out combinations in which two concepts are connected by more than one of the basic relationships: concept A must not be linked to concept B by both a hierarchical and an associative relationship.
Associative relationships between sibling terms are permitted by the ISO standard.
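The rule that a pair of concepts must not carry both a hierarchical and an associative link can be checked mechanically. The following sketch only looks at direct links, and the function name and sample data are hypothetical.

```python
def conflicting_pairs(broader, related):
    """Return concept pairs linked both hierarchically and associatively.

    broader: set of (narrower_concept, broader_concept) tuples
    related: set of (concept, concept) tuples, in either direction
    """
    hierarchical = {frozenset(pair) for pair in broader}
    associative = {frozenset(pair) for pair in related}
    return sorted(tuple(sorted(pair)) for pair in hierarchical & associative)


# Sample data based on the milk example above; the related links are invented.
broader_links = {("fat milk", "milk"), ("milk", "dairy product")}
related_links = {("milk", "fat milk"), ("milk", "cheese")}
```

Here `conflicting_pairs(broader_links, related_links)` reports the pair of "fat milk" and "milk", which an editor would resolve by dropping the associative link.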
The model of the ISO standard provides scope notes for concepts, though the standard does not mandate the use of scope notes or definitions. However, scope notes and/or definitions are crucial for the clarification of the intended meaning in almost all cases.
ISO 25964-1 models a class ThesaurusArray to group concepts which are hierarchically linked. A thesaurus array is indicated by a node label showing how the concepts have been arranged. A node label “contains one of two different types of information: a) the name of a facet to which following terms belong; or b) the attribute or characteristic of division by which an array of sibling concepts has been sorted or grouped.” [ISO 25964-1, 2.38]
The class ThesaurusArray is mapped to skos:Collection.
ISO 25964-1 addresses three cases in which node labels can be used to display facets or sub-facets:
1. They label a facet as the top of a hierarchy.
2. They are inserted in a hierarchy to introduce a new facet by which the subordinate concepts are arranged.
3. They are inserted as node labels to show the characteristic of division by which sibling concepts (members of an array) are grouped.
All subordinate concepts are narrower concepts of the superordinate concept. The model of the ISO standard does not explicitly distinguish between these different types of node labels. However, it is important to consider that a true hierarchical relationship holds between sibling concepts grouped by a characteristic of division and its superordinate concept.
In contrast, this is not true for concepts which are grouped by node labels showing facet names. Thus it should be taken into consideration to introduce a type distinction for ThesaurusArray. Distinguishing types of node labels would allow for different views. In some cases it can be useful to omit node labels showing characteristics of division.
The ISO model defines a hasTopConcept relationship between a thesaurus concept and the inferred topmost concept of its hierarchy.
The SKOS model, instead, defines a hasTopConcept relationship between a concept scheme and a concept, thus allowing arbitrary assertions about top concepts. Providing skos:hasTopConcept as navigation aids should, therefore, not be considered a necessary quality criterion. Entries to facilitate browsing can be provided by concept groups (skos:Collection).
“Micro-Thesaurus”, like “ThesaurusArray”, groups concepts. “Micro-Thesaurus” is not mapped to skos:ConceptScheme, because if each of the microthesauri were represented as skos:ConceptScheme, semantically connected concepts would be scattered across many different schemes.
According to the standard, “Group of terms” and “Micro-Thesaurus” would each be modelled as ConceptGroup, the latter of type “microthesaurus”.
Duplicate control is essential to avoid using the same term for two different concepts.
The ISO standard recommends that no duplicate terms should be entered for the same language, whether a preferred or a non-preferred term. Thus, a qualifier should be added to each homographic term.
“Homographs (sometimes referred to by the broader term “homonyms”) are words with the same spelling but different meanings. […] When homographs are needed as thesaurus terms, the meaning of each term should be clarified and the traditional way to do this is by adding to it a qualifier in parentheses. The qualifier should be as brief as possible, ideally consisting of one word. Often a broader term, the qualifier should indicate the context or subject area to which the concept belongs. It forms part of the term and does not serve as a scope note.” [ISO 25964-1, 6.2.2]
Terms which have two or more meanings (homographs) are common in natural language. Homographs cause problems in thesaurus maintenance, indexing and retrieval. According to the ISO standard, qualifiers are added to the homograph in parentheses, forming an integral part of the term. This method can lead to the accumulation of arbitrary variants of qualifiers.
Sometimes ambiguity can instead be resolved by providing scope notes.
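Duplicate control can be sketched as a check that no label is used by more than one concept within the same language. The concept identifiers and labels below are hypothetical, and comparing labels case-insensitively is an assumption of this sketch.

```python
from collections import defaultdict


def find_homographs(labels):
    """Return {(language, label): [concept ids]} for labels shared by several concepts.

    labels: iterable of (concept_id, language, label) tuples
    """
    seen = defaultdict(set)
    for concept_id, language, label in labels:
        seen[(language, label.casefold())].add(concept_id)
    return {key: sorted(ids) for key, ids in seen.items() if len(ids) > 1}


# Hypothetical labels: c1 and c2 clash in English, c3 carries a qualifier.
sample_labels = [
    ("c1", "en", "Bar"),
    ("c2", "en", "bar"),
    ("c3", "en", "bar (legal profession)"),
    ("c4", "fr", "bar"),
]
```

A qualified label such as "bar (legal profession)" no longer clashes, because the qualifier is part of the term.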
A SKOS vocabulary is essentially a list of SKOS ‘concepts’, plus some metadata. Each SKOS concept has some or all of the following features (this list is not comprehensive), of which only the first two are required.
- A single URI representing the concept, mainly for use by computers (that is, it is not required to be human-readable). This is a syntactic requirement.
- A single preferred label in each supported language of the vocabulary, for use by humans. This is required.
- Alternative labels which applications may encounter, whether simple synonyms or commonly-used aliases.
- A definition for the concept, where one exists in the original vocabulary, to give a meaning for the term. This need not be extensive, but it should indicate to someone using the vocabulary what the Concept is intended to refer to, and how precise it is expected to be.
- A scope note to further clarify a definition, or the usage of the concept. While the definition explains the meaning of the term in some abstract sense, the scope note provides practical usage hints to a user of the vocabulary. For example, a scope note might say ʻThis is not to be confused with…ʼ or ʻIn the case X, use term Yʼ.
- A concept may also be involved in any number of relationships with other concepts. The types of relationships are:
  - Narrower or more specific concepts.
  - Broader or more general concepts.
  - Related concepts.
Only the URI and preferred label are required; all of the remaining features are optional. In addition to the information about a single concept, a vocabulary can contain information to help users navigate its structure and contents:
- The “top concepts” of the vocabulary, i.e. those that occur at the top of the vocabulary hierarchy defined by the broader/narrower relationships, can be explicitly stated to make it easier to navigate the vocabulary.
- Concepts that form a natural group can be defined as being members of a “collection”.
- Versioning information can be added using change notes.
- Additional metadata about the vocabulary, for example indicating the publisher, must be documented using the Dublin Core metadata set. At a minimum, the vocabulary’s skos:ConceptScheme should be annotated with DC Terms “title”, “creator”, “description” and “created”.
A suitable skos:ConceptScheme declaration might start:
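As a hypothetical illustration of such a declaration, the following Python sketch assembles one in Turtle syntax with the four DC Terms annotations named above. The `dct` prefix, the example URI and all literal values are assumptions, and the sketch does not escape quotation marks inside the values.

```python
def concept_scheme_turtle(uri, title, creator, description, created):
    """Build a minimal Turtle declaration for a skos:ConceptScheme."""
    return "\n".join([
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        "@prefix dct: <http://purl.org/dc/terms/> .",
        "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .",
        "",
        f"<{uri}> a skos:ConceptScheme ;",
        f'    dct:title "{title}"@en ;',
        f'    dct:creator "{creator}" ;',
        f'    dct:description "{description}"@en ;',
        f'    dct:created "{created}"^^xsd:date .',
    ])


# Hypothetical values, for illustration only.
declaration = concept_scheme_turtle(
    "https://example.org/legal-thesaurus",
    "Legal Thesaurus",
    "Example Publisher",
    "A thesaurus of legal concepts.",
    "2020-01-01",
)
```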
SKOS Good Practices
- Relationships (“broader”, “narrower”, “related”) between concepts SHOULD be present, but are not required; if used, they SHOULD be complete (thus all “broader” links have corresponding “narrower” links in the referenced entries and “related” entries link each other).
- A thesaurus SHOULD indicate the “top concepts” in the vocabulary (namely those concepts that do not have any “broader” relationships), using the skos:hasTopConcept relation. This should be done only if the vocabulary is structured enough that this is useful. [Rationale: “top concepts” can be of use as initial hints for a user interface which aims to help a user navigate through a thesaurus starting from “the top”; if a thesaurus is so flat that these “top concepts” would be too numerous, then the thesaurus developer may reasonably decide not to add them].
- A thesaurus MUST contain only one skos:ConceptScheme.
- At a minimum, the thesaurus’s skos:ConceptScheme MUST be annotated with properties in the DC Terms namespace http://purl.org/dc/terms/, namely “dc:title”, “dc:creator”, “dc:created” and “dc:description”, with values of an appropriate type.
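Two of these practices lend themselves to automation: deriving the inverse "narrower" and symmetric "related" links, and finding the top concepts. The sketch below assumes relations are held in plain dictionaries of sets; the function names and sample concepts are hypothetical.

```python
def complete_relations(broader, related):
    """Derive narrower links from broader links and make related links symmetric."""
    narrower = {}
    for concept, parents in broader.items():
        for parent in parents:
            narrower.setdefault(parent, set()).add(concept)
    symmetric = {}
    for concept, others in related.items():
        for other in others:
            symmetric.setdefault(concept, set()).add(other)
            symmetric.setdefault(other, set()).add(concept)
    return narrower, symmetric


def top_concepts(concepts, broader):
    """Return the concepts that have no broader relationship."""
    return sorted(c for c in concepts if not broader.get(c))


# Hypothetical sample data.
sample_concepts = {"law", "contract", "lease"}
sample_broader = {"contract": {"law"}, "lease": {"contract"}}
sample_related = {"lease": {"tenancy"}}
```

With this data, `top_concepts(sample_concepts, sample_broader)` yields only "law", which would be the value to state with skos:hasTopConcept.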
The Keyword List
This vocabulary is based on a set of keywords maintained. The intended usage of the vocabulary is to tag articles with descriptive keywords to aid searching for articles on a particular topic.
The keywords are organised into categories which have been modelled as hierarchical relationships. Additionally, some of the keywords are grouped into collections, a grouping that has been mirrored in the SKOS version. The vocabulary contains no definitions or related links as these are not provided in the original keyword list, and only the handful of alternative labels and scope notes that are present in the original keyword list.
This vocabulary consists of a set of keywords organised into an enumerated hierarchical structure. Each term consists of a taxonomic number and a label. There are no definitions, scope notes, or cross references.
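An enumerated hierarchy of this kind can be turned into broader links by reading the taxonomic numbers. The sketch below assumes a dotted numbering such as "1.1.2" followed by a space and the label; the actual notation of the keyword list is not specified here, so both the format and the sample entries are hypothetical.

```python
def parse_keywords(lines):
    """Parse 'number label' lines and derive broader links from the numbering."""
    entries = {}
    for line in lines:
        number, _, label = line.strip().partition(" ")
        entries[number] = label
    broader = {}
    for number in entries:
        if "." in number:
            parent = number.rsplit(".", 1)[0]  # drop the last numbering segment
            if parent in entries:
                broader[number] = parent
    return entries, broader


# Hypothetical sample entries.
sample_lines = ["1 Law", "1.1 Contract law", "1.1.2 Leases", "2 Courts"]
```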
UNEP (1997). EnVoc Multilingual Thesaurus of Environmental Terms. United Nations Environment Programme: Nairobi.
Williamson, N. & C. Beghtol (2004). Knowledge Organization and Classification in International Information Retrieval. Routledge: London.
Abel, F. (2008). “The benefit of additional semantics in folksonomy systems”. In: Proceedings of the 2nd PhD workshop on information and knowledge management. New York: ACM, p. 49-56.
ANSI/NISO Z39.19 (2005). Guidelines for the construction, format, and management of monolingual controlled vocabularies.
Avesani, P.; Cova, M. (2005). “Shared lexicon for distributed annotations on the web”. In: Proceedings of the 14th international conference on World Wide Web. New York: ACM, p. 207-214.
Broughton, V. (2006). Essential thesaurus construction. London: Facet Publishing.
Burke, R. (2000). “Knowledge-based recommender systems”. In: Encyclopedia of Library and Information Systems, vol. 69.
Cafarella, M. J.; Halevy, A.; Madhavan, J. (2011). Structured data on the web. New York: ACM, p. 72-79.
Davies, J.; Harris, S.; Crichton, C. et al. (2008). “Metadata standards for semantic interoperability in electronic government”. In: Proceedings of the 2nd international conference on theory and practice of electronic governance. New York: ACM, p. 67-75.
Echarte, F.; Astrain, J. J.; Córdoba, A. et al. (2009). “Acoar: a method for the automatic classification of annotated resources”. In: Proceedings of the fifth international conference on knowledge capture. New York: ACM, p. 181-182.
Gaus, W. (2005). Dokumentations-und Ordnungslehre: Theorie und Praxis des Information Retrieval. Berlin: Springer.
Golub, K.; Moon, J.; Tudhope, D. et al. (2009). “Entag: enhancing social tagging for discovery”. In: Proceedings of the 9th ACM/IEEE-CS joint conference on digital libraries. New York: ACM, p. 163-172.
Isaac, A.; Summers, E. (2008). Skos simple knowledge organization system primer.
ISO 2788 (1986). Documentation-guidelines for the establishment and development of monolingual thesauri.
ISO 5964 (1985). Documentation-guidelines for the establishment and development of multilingual thesauri.
ISO 25964-1 (2011). Information and documentation-thesauri and interoperability with other vocabularies-part 1: Thesauri for information retrieval.
Kless, D.; Milton, S. (2010). Towards quality measures for evaluating thesauri.
Kules, B.; Kustanowitz, J.; Shneiderman, B. (2006). “Categorizing web search results into meaningful and stable categories using fast-feature techniques”. In: Proceedings of the 6th ACM/IEEE-CS joint conference on digital libraries. New York: ACM, p. 210-219.
Miles, A.; Bechhofer, S. (2008). Skos simple knowledge organization system reference.
Miles, A.; Rogers, N.; Beckett, D. (2004). Skos-core guidelines for migration: guidelines and case studies for generating rdf encodings of existing thesauri.
Morshed, A.; Keizer, J.; Johannsen, G. et al. (2010). From agrovoc owl model towards agrovoc skos model.
Neubert, J. (2009). “Bringing the ‘thesaurus for economics’ on to the web of linked data”. In: Proceedings of the linked data on the web workshop, vol. 538.
Orlandi, F.; Passant, A. (2010). “Semantic search on heterogeneous wiki systems”. In: Proceedings of the 6th international symposium on wikis and open collaboration. New York: ACM, p. 4:1-4:10.
Park, J.; Bui, Y. (2006). An assessment of metadata quality: A case study of the national science digital library metadata repository.
Rodríguez, J. M.; Azcona, E. R.; Paredes, E. R. (2008). Promoting government controlled vocabularies for the semantic web: the eurovoc thesaurus and the cpv product classification system.
Sacco, O.; Bothorel, C. (2010). “Exploiting semantic web techniques for representing and utilising folksonomies”. In: Proceedings of the international workshop on modeling social media. New York: ACM, p. 9:1-9:8.
Sah, M.; Hall, W.; Gibbins, N. M. et al. (2007). “Semport: a personalized semantic portal”. In: Proceedings of the eighteenth conference on hypertext and hypermedia. New York: ACM, p. 31-32.
Sah, M.; Wade, V. (2010). “Automatic metadata extraction from multilingual enterprise content”. In: Proceedings of the 19th ACM international conference on information and knowledge management. New York: ACM, p. 1665-1668.
Soergel, D. (1994). Indexing and retrieval performance: The logical evidence.
Soergel, D. (2002). “Thesauri and ontologies in digital libraries: 1. structure and use in knowledge-based assistance to users”. In: Proceedings of the 2nd ACM/IEEE-CS joint conference on digital libraries. New York: ACM, p. 415-415.
Stvilia, B.; Gasser, L.; Twidale, M. B. et al. (2007). A framework for information quality assessment.
Tordai, A.; Ossenbruggen van, J.; Schreiber, G. (2009). “Combining vocabulary alignment techniques”. In: Proceedings of the fifth international conference on knowledge capture. New York: ACM, p. 25-32.
Waitelonis, J.; Sack, H.; Hercher, J. et al. (2010). “Semantically enabled exploratory video search”. In: Proceedings of the 3rd international semantic search workshop. New York: ACM, p. 8:1-8:8.
Wang, Y.; Stash, N.; Aroyo, L. et al. (2009). “Semantic relations for content-based recommendations”. In: Proceedings of the fifth international conference on knowledge capture. New York: ACM, p. 209-210.
De Norre, B. & D. Groenez (2002). A Multilingual Thesaurus for Accessing Eurostat’s Reference Databases. Joint UNECE/EUROSTAT Work Session on Statistical Metadata. Luxembourg, 6-8 March 2002.
De Norre, B.; Groenez, D. & M. Pellegrino (2004). Integrating Statistical Terminology Tools within Eurostat’s Dissemination Policy. Joint UNECE/EUROSTAT/OECD Work Session on Statistical Metadata. Geneva, 9-11 February 2004.
Diaz Munoz, P. (2008). The role of Statistical Data and Metadata Exchange in global statistical infrastructure. Statistical Journal of the IAOS, 25(2008): 47-54.
Felluga, B.; Plini, P.; Cunningham, G. & S. Lucke (2000). The Global Environmental Thesaurus Project. Global Conference on Access to Environmental Information. Dublin, 11-15 September 2000.
Nogueras, J.; Zarazaga, F. & P. Muro (2005). Geographic Information Metadata for Spatial Data Infrastructures. Resources, Interoperability and Information Retrieval. Springer: Heidelberg.
OECD (2007). Data and Metadata Reporting and Presentation Handbook. Organisation for Economic Co-operation and Development: Paris.
Roe, S. & A. Thomas (2004). The Thesaurus. Review, Renaissance, and Revision. Routledge: London.
Rowley, J. & R. Hartley (2008). Organizing Knowledge: An Introduction to Managing Access to Information. Ashgate: London.
Severino, F. (2007). The term development in the thesauri of international organisations. The European Journal of Development Research, 19(2). 327-351.