Web Classification


“It is a question of how those who create the information base structure and organize concepts relating to a data entity and how the retrieved concepts and records are organized and presented in a sequence and format helpful to users, so that the latter feel the system not only provides timely, precise and complete information but also is friendly.”

(Neelameghan 1992, 204)

Similarity in Evolution

If one considers the development of database design models alongside the development of classification schemes, one may notice a parallel between the two: both have moved from a more hierarchical structure to a faceted one. Relational databases have replaced the older hierarchical databases and are found to be more effective in organizing data.

Neelameghan (1992) draws a clear comparison between the evolution of classification theory and the development of database data models. He compares the process of designing a database to that of designing a classification scheme (206):

Database – Classification Scheme

Data entities (objects about which data is to be collected) – Subjects

Attributes of data entities – Attributes or facets of subject

Data model to map entities and relationships – Classification scheme to map concepts and isolates

(Pollitt has also identified this parallel in his paper.)

Construction of Database

It is therefore considerably easier to construct a relational database using facets than using the enumerative schedules of the traditional classification systems, since the two share a similar underlying structure. A table can represent the subject, and it will contain fields that are the facets and attributes. Once all the available units have been entered, the classifier only needs to pick from this list of terms to construct the desired notation.

Perhaps one of the ways to aid the user in deciding what search term to use is to provide a list of all the options in a particular facet. This list can easily be called up in a relational database. The job of the cataloguer can also be made considerably easier if the terms in each facet are made readily available as a list online.
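As a rough illustration of the two paragraphs above, the following Python sketch uses the standard sqlite3 module to build one table for a subject, with one column per facet, and then calls up the list of terms available in a facet. The table name, facet names and sample records are all invented for the example and do not come from any of the cited papers.

```python
import sqlite3

# Hypothetical subject table "document"; each facet is a column (assumed names).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE document (title TEXT, discipline TEXT, place TEXT, period TEXT)"
)
conn.executemany(
    "INSERT INTO document VALUES (?, ?, ?, ?)",
    [
        ("Rice farming in India", "Agriculture", "India", "1950s"),
        ("Wheat diseases in Canada", "Agriculture", "Canada", "1990s"),
        ("Public libraries in India", "Library science", "India", "1990s"),
    ],
)

FACETS = ("discipline", "place", "period")


def terms_in_facet(facet):
    """Return the sorted list of distinct terms recorded under one facet."""
    if facet not in FACETS:  # column names cannot be bound as SQL parameters
        raise ValueError(f"unknown facet: {facet}")
    rows = conn.execute(f"SELECT DISTINCT {facet} FROM document ORDER BY {facet}")
    return [row[0] for row in rows]


print(terms_in_facet("place"))  # ['Canada', 'India']
```

The same list could serve both the searcher choosing a term and the cataloguer assigning one, which is the point made above.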


The “Light Bulb” for Users

Often, a library user approaches the online catalog at a computer terminal without knowing what terms he or she should use to search for the desired material.

In addition, there is no intuitive way to narrow a search. For instance, in an academic library, a student conducting research may be looking for something very specific. A catalog that allows the user to narrow the search and set the parameters will be particularly valuable.

A classification scheme can “provide a useful tool for the questioner to define what his question really is” (Vickery 1966, 15). Unfortunately, most catalogs do not offer call numbers as a common retrieval tool, which wastes much of the laborious work that has gone into developing the “right” notations.

Perhaps this is due to the fact that it is hard to formulate a search using notations that have been developed from an enumerative scheme. Even if the notations do indicate the specificity of the content of the item, it is difficult to isolate the different aspects in the notation for the user to conduct a search.

However, if faceted classification is employed, identifying the different aspects represented by the notation becomes a simple process, since the aspects already come in distinct units. Faceted analysis is also argued to provide a “natural” framework for interpreting a user’s inquiry (Foskett 1992, 230).

“Faceted analysis…does reflect a natural way of thinking, because it separates out the various elements of a compound subject, by means of relating them to certain general categories which are comprehensible to any user.”

(Foskett 1992, 232)
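To make the idea of "distinct units" concrete, here is a minimal Python sketch that splits a faceted notation into labelled parts. It assumes a simplified scheme in which each facet is introduced by its own indicator symbol; the symbol table and the sample notation are illustrative assumptions only, not a faithful rendering of the Colon Classification rules.

```python
import re

# Assumed facet indicators: each symbol introduces exactly one facet.
INDICATORS = {
    ",": "personality",
    ";": "matter",
    ":": "energy",
    ".": "space",
    "'": "time",
}


def split_notation(notation):
    """Split a faceted notation into its main class and labelled facets."""
    # The capturing group keeps each indicator symbol in the result list,
    # so symbols sit at odd positions and their values at even positions.
    parts = re.split(r"([,;:.'])", notation)
    result = {"main_class": parts[0]}
    for symbol, value in zip(parts[1::2], parts[2::2]):
        result[INDICATORS[symbol]] = value
    return result


# Made-up notation used purely as input for the sketch.
print(split_notation("J,381:4.44'N5"))
```

Because every facet arrives as its own key, a catalog could in principle let the user search on any one of them separately. A notation built from an enumerative scheme offers no such separators to split on.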

Importance of Controlled Vocabulary

To be able to converse and understand one another, the user of an online information retrieval system and the system itself have to use a common language. This is the basis for the need for, and development of, controlled vocabulary.

Numerous subject headings, indexing languages and lists of controlled vocabulary have been developed to provide access to the subject content of an information package.

However, we often forget that the classification scheme is a list of subjects too. In the painstaking task of assigning a call number to an information package, someone has already done an analysis of its content. It is easy and convenient to just use the terms in the classification scheme.

The compilation of the basic concepts and categories of the faceted classification scheme amounts to building a controlled vocabulary list (Vickery 1966, 9). The advantage of using such control is that it brings the experience of the indexer and the user closer together.

Giving users the opportunity to combine the basic concepts at will results in a query that accurately reflects the user’s need.
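A small Python sketch of combining concepts at will is given below. It assumes each record carries one controlled term per facet; the records, facet names and the search function are hypothetical and serve only to show how adding a facet term narrows the result.

```python
# Hypothetical records: each maps facet names to one controlled term.
RECORDS = [
    {"title": "Rice farming in India", "discipline": "Agriculture", "place": "India"},
    {"title": "Wheat diseases in Canada", "discipline": "Agriculture", "place": "Canada"},
    {"title": "Public libraries in India", "discipline": "Library science", "place": "India"},
]


def search(records, **chosen):
    """Return titles of records that match every chosen facet term."""
    return [
        record["title"]
        for record in records
        if all(record.get(facet) == term for facet, term in chosen.items())
    ]


print(search(RECORDS, place="India"))
# ['Rice farming in India', 'Public libraries in India']
print(search(RECORDS, discipline="Agriculture", place="India"))
# ['Rice farming in India']
```

The user never has to guess a heading string: each argument is a term taken from the same controlled list the indexer used.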

Subject headings seem to be a very useful tool in helping users of the catalog to get more details about the content of the book. However, since not many users are familiar with the format of the subject headings, the label “subject search” in many OPACs can be misleading for a user who does not realise that the term keyed in has to match a controlled vocabulary list.

Solution: A View-Based User Interface

One form of user interface that is worthy of investigation and further development is the view-based user interface. A good example and account of such an interface can be found in Pollitt’s paper (1997). In the paper he shows screenshots from the EMBASE database, which provides a “view” or a window for each facet. Inside the view are the available terms for that facet. This will often help the user in defining his or her search criteria.
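The following Python sketch suggests how the data behind such views might be prepared: for each facet it collects the terms still available under the user's current selection. The per-term counts, the facet names and the records are assumptions added for illustration; Pollitt's paper is not the source of this code.

```python
from collections import Counter

# Hypothetical records with one term per facet (assumed facet names).
RECORDS = [
    {"discipline": "Agriculture", "place": "India", "period": "1950s"},
    {"discipline": "Agriculture", "place": "Canada", "period": "1990s"},
    {"discipline": "Library science", "place": "India", "period": "1990s"},
]
FACETS = ("discipline", "place", "period")


def build_views(records, selection):
    """For each facet, count the terms among records matching the selection."""
    matching = [
        record
        for record in records
        if all(record[facet] == term for facet, term in selection.items())
    ]
    return {facet: Counter(record[facet] for record in matching) for facet in FACETS}


# One "view" per facet, restricted to records about India.
views = build_views(RECORDS, {"place": "India"})
for facet, counts in views.items():
    print(facet, dict(counts))
```

Each entry in the returned dictionary corresponds to one window on screen, listing only the terms that can still lead to a result.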


Organizing the Chaos of the Web

There have been numerous attempts to establish order and apply organization to the chaos of the Internet and the World Wide Web. Anyone who has conducted a search on the Web will immediately realize the difficulty in retrieving relevant results. In evaluating the possible methods of organizing the Web, perhaps one should consider the Colon Classification as a candidate.

Colon Classification: A Worthy Candidate

One of the major criticisms of the Colon Classification is its lengthy notation, which is of low practicality in terms of putting the call number on the spine label of a book, or requiring the user to write down or memorize the whole notation to locate the particular item in the library. However, this problem disappears when we consider the environment of the web, since we do not have to worry about the “physical” location of a document (Pollitt 1997).

Perhaps the concept of analytico-synthetic classification has already been applied to the Web. Glassel (1998) compares Ranganathan’s Colon Classification to Yahoo!, a popular search directory on the web. Both systems are based on combining facets to facilitate searching and maximize the number of relevant results.

“One important advantage that virtual collections such as Yahoo! have over the print environment, in terms of notation schemes and their citation order (the order in which the facets are put together), is that the order of the facets in a string doesn’t have to be set in stone. An electronic resource isn’t limited to a single physical location. In a library, a book is only supposed to live in one place on a shelf. In the digital world, what is to stop us from classifying a resource in multiple places within a hierarchy? Nothing!” (Glassel 1998)
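The point about citation order can be sketched in a few lines of Python: if the order of facets is not fixed, a resource can be filed under every ordering of its facet terms. The function name and the sample terms are hypothetical, and this is not a description of how Yahoo! itself was built.

```python
from itertools import permutations


def directory_paths(facet_terms):
    """List every ordering of a resource's facet terms as a directory path."""
    return ["/".join(ordering) for ordering in permutations(facet_terms)]


paths = directory_paths(["Agriculture", "India", "1950s"])
print(len(paths))  # 6
print(paths[0])    # Agriculture/India/1950s
```

A physical book could occupy only one of these six positions on a shelf, whereas an electronic resource can be reached through all of them.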

Web directories like Yahoo! are an example of a concept-based system (Ellis & Vasconcelos 1999, 7). An indexer reviews the document and assigns appropriate subject terms to describe it. Implementing the Colon Classification scheme on the web may be useful in that it provides the website creator, the indexer, and the user with a common language to describe and identify the content of the page. Since the Colon Classification is very accommodating to new concepts and new composite subjects, it is quite appropriate for the fast-growing web environment.

As with online retrieval systems, the advantage of a faceted scheme is that it is more attentive to the user’s need. Every query is unique and comes from a specific perspective. What is relevant to one user may not be relevant to another.


Chan, Lois Mai. Cataloging and Classification: An Introduction. 2nd ed. New York: McGraw-Hill, 1994.

Ellis, David, and Ana Vasconcelos. “Ranganathan and the Net: Using Facet Analysis to Search and Organise World Wide Web.” Aslib Proceedings: New Information Perspectives 51, no. 1 (1999): 3-10.

Foskett, A. C. The Subject Approach to Information. 4th ed. London: Clive Bingley, 1982.

Foskett, D. J. “Ranganathan and ‘User-Friendliness’.” Libri 42, no. 3 (1992): 235-241.

Garfield, Eugene. “A Tribute to S. R. Ranganathan, the Father of Indian Library Science. Part 1. Life and Works.” In Essays of an Information Scientist 7 (1984): 37-44.

Glassel, Aimee. “Was Ranganathan a Yahoo!?” End User’s Corner (17 March, 2000).

Godert, Winfried. “Facet Classification in Online Retrieval.” International Classification 18, no. 2 (1991): 98-109.

Kwasnik, Barbara H. “The Role of Classification in Knowledge Representation and Discovery.” Library Trends 48, no. 1 (1999): 22-47.

Maple, Amanda. “Faceted Access: A Review of the Literature.” (16 March, 2000)

Neelameghan, A. “Application of Ranganathan’s General Theory of Knowledge Classification in Designing Specialized Databases.” Libri 42, no. 3 (1992): 202-226.

Pollitt, Steven. Interactive Information Retrieval Based on Faceted Classification Using Views (16 March, 2000)

Ranganathan, S. R. Elements of Library Classification. Bombay: Asia Publishing House, 1962.

Vickery, B. C. Faceted Classification Schemes. Rutgers Series on Systems for the Intellectual Organization of Information, ed. Susan Artandi, v. 5. New Brunswick, NJ: Rutgers University Press, 1966.

Wali, M. L., and R. K. Koul. “Development of Notation in Freely Faceted Classification: A Case Study.” Herald in Library Science 11, no. 1 (1972): 30-43.