legalthesaurus.org

Faceted Classification

An alternative to hierarchies…

Articles

[Ranganathan: Ahead of His Century – Faceted Classification]
Faceted classification is also called analytico-synthetic, named after the two main processes involved in the composition of a call number [an alphabetic or alphanumeric notation]. The two processes are:

Analysis
Breaking down each subject into its basic concepts.
Synthesis
Combining the relevant units and concepts to describe the subject matter of the information package in hand.
Facets are “clearly defined, mutually exclusive, and collectively exhaustive aspects, properties or characteristics of a class or specific subject” ([Maple 1995])

[Faceted Classifications and Thesauri]
The facet classification is an analytico-synthetic scheme. It is analytic because it subdivides broader elements into single concepts that are clearly defined through facet analysis. It is synthetic in that new elements can be developed.

[Faceted classification of information]
Given the significant difficulties in categorizing books, papers, and articles using traditional library classification techniques, it would seem next to impossible for humans to classify the small chunks of rapidly changing information that characterize information-intensive business environments. But it’s not. Library and information science professionals have already provided the foundations of an alternative to traditional classification techniques: faceted classification.

[A Simplified Model for Facet Analysis]
…an attempt has been made to provide LIS students and perhaps LIS practitioners with a condensed model that gives an overview of the underlying principles of facet analysis that are common to both these theories, and which reflects common usage amongst the designers of faceted classification systems and IR thesauri.

[All about Facets and Controlled Vocabularies]
The first in a series of articles from BoxesAndArrows

[Use of faceted classification]
From WebDesignPractices – examples of sites that use faceted classification, but not necessarily FacetedBrowsing

[Faceted metadata search – search tools report]
Combines faceted approaches with keyword searching

[How to Make a Faceted Classification and Put It On the Web]
A good thorough article

[Putting Facets on the Web: An Annotated Bibliography]
A good summary, plus an extensive bibliography

[Faucet Facets: A Few Best Practices for Designing Multifaceted Navigation Systems]
From JeffreyVeen, AdaptivePath

[A Primer on Faceted Navigation and Guided Navigation]
A short evangelical article on faceted navigation

“Faceted classification serves up multiple “pure” classification schemes rather than a single “motley” Taxonomy.

Because each facet is focused on a specific, limited dimension of the information space, its hierarchy can be much smaller and flatter. Even with several facet hierarchies, you’re dealing with relatively few controlled vocabulary terms. However, through the power of post-coordination, you’re able to create a huge number of combinations.”
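
The arithmetic behind that claim can be sketched in a few lines. This is only an illustration: the three facets and their values below are made up for the example, not taken from any of the linked articles.

```python
from itertools import product
from math import prod

# Hypothetical facets, each with a small, flat controlled vocabulary.
facets = {
    "varietal": ["Chenin Blanc", "Shiraz", "Riesling", "Merlot"],
    "region": ["Western Australia", "Barossa Valley", "Loire"],
    "price": ["under $10", "$10-20", "over $20"],
}

# Number of controlled-vocabulary terms that have to be maintained.
term_count = sum(len(values) for values in facets.values())

# Number of distinct descriptions available by post-coordination,
# choosing exactly one value from each facet.
combination_count = prod(len(values) for values in facets.values())

# The combinations are generated on demand rather than stored as a tree.
combinations = list(product(*facets.values()))

print(term_count, combination_count)
```

With these made-up vocabularies, ten terms yield thirty-six combinations, and each extra facet multiplies the second number while only adding to the first.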

[The Speed of Information Architecture]

[peterme becomes a librarian]


Examples

See FacetedBrowsing for examples of sites that use a FacetedClassification to browse and filter content.

[NCSU Libraries]
Uses facets to refine search of the catalog. The facets appear to be well chosen and well indexed. Uses Endeca.
[FacetMap]
TravisWilson has built FacetMap, a demo that shows how to combine faceted classification, regarded as a “bottom-up” system, with the hierarchical navigation that’s typically considered a “top-down” structure, thereby giving (unsuspecting) users much more power over their browsing. The site lets you upload your own data to see how it can be browsed using the facetmap system.
[New York Citysearch restaurants]
The main categories are Neighborhood, Cuisine, and Price, and the facets are top pick, user rating, etc.
[The Berkeley FlamencoProject]
FLexible information Access using MEtadata in Novel COmbinations. Includes a number of [articles] and a [working proof of using faceted metadata for an image library search]. PeterMe has a [blog entry] discussing this.
[Demo from Vindigo]
Cuisine is the category and price, top pick, and distance are facets.
[DiamondWiki]
an experimental wiki focusing on FacetedClassification, using MetaData embedded directly in Wiki pages
[Nordstrom – advanced search]
Category must be chosen first and sub-categories are based on this. Results are not displayed as you go, and it is possible to get a null result.
[Kohler Bathroom Faucets]
Categories include category (this must be selected first), price, colours, style. Doesn’t display any results as you filter, just shows how many results there are.
[Wine.com]
Wine.com have again implemented a very good FacetedBrowsing interface. They were one of the first, then dropped it and now it is back.
[Meta Matters Subject Index]
This site indexes content with facet hierarchies and uses them as separate hierarchies to get to content. There is no ability to filter with one facet and then another.

Other collections of FacetedClassification resources

[AIfIA Library results for FacetedClassification]

[KeithInstone has a collection of stuff for FacetedBrowsing]

[Dey Alexander has a collection]

Discussion

So, is a FacetedClassification just a DAG of categories? Is this similar to how MeatBall:HyperbolicBrowser is in MeatBall:CategoryInformationVisualization, which in turn is in MeatBall:CategoryInterfaceDesign and MeatBall:CategoryNavigation, except the system automatically builds the categorization “hypertree” (i.e. a directed acyclic graph, or DAG) from these labels? — SunirShah

Uhmmm .. no. The classic example of a faceted information domain is wine, where any given bottle of fermented grape juice gets classified in a small number of “facets” such as varietal, location of origin, price bracket, cellaring, and food suitability.

Thus the bottle of Moondah Brook Chenin Blanc 1995 I have on my shelf would have these facet values: Chenin Blanc : Darling Ranges, Western Australia : $10-20 : Drink now to 2000 : Pork. The thing is, these facets are not arranged into a hierarchy of Australian Wines -> Western Australia -> Whites -> Chenin Blanc -> Pork -> wretchedly cheap plonks -> drink now.

Personally, coming from a database design technical background, I can’t quite get excited by the idea that a mass of data could be segmented by a predefined set of key-fields which have a controlled vocabulary. It’s generally a good idea, just not a new and novel one (for me) – I do slicing and dicing of data like this all the time. It is new and novel to people that come from disciplines where information is arranged into hierarchies.

What I do consider new and novel though is a combination of facets and hierarchy .. have a look at TravisWilson‘s [proof of concept].

EricScheid

Ok, I finally get it. The links are organized into columns, Varietal : Region : Price, which form a conjunctive filter against the database. Hmm.. Well, if you investigate expert, knowledge, and inference systems from the early ’80s, there’s a lot of work already done on building and searching epistemologies. While novel in the web directory space (and powerful and useful), this is still pretty trivial. For instance, the immediately obvious next move is to change the taxonomies to be DAGs instead of mere trees. — SunirShah
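
For readers who want the "conjunctive filter" spelled out, here is a minimal sketch. Only the first record comes from the discussion above; the other two wines, the field names, and the helper functions are assumptions made up for illustration.

```python
# Hypothetical catalogue; only the first record is from the discussion.
wines = [
    {"name": "Moondah Brook Chenin Blanc 1995", "varietal": "Chenin Blanc",
     "region": "Western Australia", "price": "$10-20"},
    {"name": "Example Shiraz", "varietal": "Shiraz",
     "region": "Western Australia", "price": "$10-20"},
    {"name": "Example Vouvray", "varietal": "Chenin Blanc",
     "region": "Loire", "price": "over $20"},
]


def conjunctive_filter(items, **selected):
    """Return the items whose facets equal every selected value (logical AND)."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def remaining_values(items, facet):
    """List the values of one facet that still occur in a result set."""
    return sorted({item[facet] for item in items})


western = conjunctive_filter(wines, region="Western Australia")
chenin_western = conjunctive_filter(
    wines, region="Western Australia", varietal="Chenin Blanc"
)
print([wine["name"] for wine in chenin_western])
print(remaining_values(western, "varietal"))
```

Each column the user clicks adds one more condition to the AND, and remaining_values shows which links would still lead somewhere, which is one way to avoid the null results mentioned in the Examples.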

Sunir, this is interesting…you might be the one to bridge the world of graphs with the world of library science for us 🙂 Have you looked at Ontologies (see link below)? Ontologies go beyond taxonomies in that they also describe the relationships among the terms in the taxonomy. — VictorLombardi

The ontologies I think those links are talking about are really classification systems, not like the theory of types that Russell and Whitehead had to invent to avoid Russell’s paradox–and failed at doing. At an intuitive level, these kinds of ontologies are only useful for small sets of data. If you want to create something like http://dmoz.org, any classification system will inevitably break down. If you just want to describe your catalogue to your customers, then you can just deal with it mindfully. Even the Dewey Decimal System doesn’t have a good way of numbering books like Cent Mille Milliards de Poèmes (a “poem generator”), Windows98, or even MeatBall:NealStephenson’s MeatBall:CryptonomiconBook. It’s especially difficult because much of western art is groundbreaking only in that it is deconstructive of the current classification system, or “box.” Look at Jean-Jacques Rousseau and his Confessions for an example that has led us to today’s Hollywood tell-all autobiographies. So, anything you make someone will come along and unmake to effect a MeatBall:ParadigmShift. Like, what do you think a paradigm is except for the facets that select certain exemplars to be in a certain classification?

I don’t really know enough math or philosophy (yet) to discuss this in depth, and it’s already getting wanky. Like isn’t a taxonomy just a tree-based ontology? I’m personally thinking in terms of self-organizing maps, a totally useless thing for a database with a nice modernist, clearcut feel to it–like on a commercial website. It’s only interesting to me for analyzing dynamic graphs, like Usenet postings or wikis. Maybe this would be a good master’s thesis for me to consider. 🙂 I’ll get back to this, either here or on MeatballWiki, at some random time in the future. As an information scientist, I suppose I should really get around to learning the theory of information and knowledge, eh.

Anyway, for all practical purposes, designing a cataloging system for a bunch of data is as simple as looking at the data set and seeing how it’s structured. In the case of wines, wines are defined by their varietal taxonomy, almost exclusively by French law (there is a Wiki:ThreeRingBinder thicker than you can imagine that says what is a champagne and what isn’t). Geography has also generally been put into a hierarchy–although not always, and that is one place where things could break down, if only Ethiopia were a wine powerhouse–and price is just a scalar axis. It seems to me just like why you would choose a list selection box over a text box when designing a user interface. If the classification system is hierarchical, make the interface a directory. Wilson’s price axis could have been done better with a slider, I think. — SunirShah

Looking at the data and determining the commonalities is easy enough. The problem comes when one tries to organise these into a hierarchy of some kind … do you split the collection into regions first, and then within each region subdivide by varietal; or split the collection into varietals, and then within each varietal subdivide by region? This is the problem that Yahoo and dmoz face every day. However, if you discard the concept of there being one true hierarchy and instead allow dynamic hierarchies to be developed by the user as needed, simply by applying more filters, then what you have is a faceted approach. It’s only a minor extension of that to then have hierarchical facets (like regions). — EricScheid
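
A rough sketch of what "dynamic hierarchies" could mean in code: the same flat records are grouped into a tree in whatever facet order the user picks, so neither region-first nor varietal-first has to be stored. The records and the build_hierarchy helper are hypothetical, written only to illustrate the point.

```python
# Hypothetical flat records; nothing here is stored as a hierarchy.
wines = [
    {"name": "Wine A", "varietal": "Chenin Blanc", "region": "Western Australia"},
    {"name": "Wine B", "varietal": "Shiraz", "region": "Western Australia"},
    {"name": "Wine C", "varietal": "Chenin Blanc", "region": "Loire"},
]


def build_hierarchy(items, facet_order):
    """Group items into nested dicts, one level per facet in facet_order."""
    if not facet_order:
        # No facets left to split on: this is a leaf holding item names.
        return [item["name"] for item in items]
    first, rest = facet_order[0], facet_order[1:]
    groups = {}
    for item in items:
        groups.setdefault(item[first], []).append(item)
    return {value: build_hierarchy(group, rest) for value, group in groups.items()}


# The same data, split two different ways on request.
by_region = build_hierarchy(wines, ["region", "varietal"])
by_varietal = build_hierarchy(wines, ["varietal", "region"])
print(by_region)
print(by_varietal)
```

Both trees are derived views over the same records, so the argument about which split is the "true" one never has to be settled.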


The facet browser at the FlamencoProject [1] has prompted a thought — on this site, like many wikis, each page could be tagged with many Category tags. Now think of each Category tag as a boolean facet, and imagine a facet browser for exploring those links. The top level would be a list of all the categories. Click into one category and see a list of all categories mentioned on pages within that category (minus the current one). Consider each category to be a hierarchical facet, with all the linked pages being values within that facet. This concept is straying far from the canonical “faceted classification” definition though. Have to think some more on this. — EricScheid
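
To make the thought concrete, here is a small sketch of such a category browser. The page names and their tags are invented for the example, and top_level and drill_down are hypothetical helpers, not features of any existing wiki engine.

```python
# Hypothetical wiki pages, each tagged with a set of Category tags.
pages = {
    "HyperbolicBrowser": {"CategoryInformationVisualization", "CategoryNavigation"},
    "FacetMap": {"CategoryNavigation", "CategoryInterfaceDesign"},
    "SliderWidget": {"CategoryInterfaceDesign"},
}


def top_level(pages):
    """List every category used anywhere: the browser's starting view."""
    return sorted(set().union(*pages.values()))


def drill_down(pages, selected):
    """Return pages carrying all selected categories, plus the other
    categories found on those pages (the next choices to offer)."""
    selected = set(selected)
    matching = sorted(name for name, tags in pages.items() if selected <= tags)
    further = set()
    for name in matching:
        further |= pages[name] - selected
    return matching, sorted(further)


print(top_level(pages))
print(drill_down(pages, {"CategoryNavigation"}))
```

Clicking a category calls drill_down with one more tag in the selected set, and the second list it returns is exactly the "categories mentioned on pages within that category (minus the current one)".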

And one of the pages in MeatBall:CategoryInterfaceDesign would be another category, MeatBall:CategoryInformationVisualization, which you could click on to expand. However, MeatBall:CategoryInformationVisualization is also in MeatBall:CategoryNavigation, giving you a DAG. — SunirShah


Multivalence is not a vice

Do the values of a facet need to be mutually exclusive? Because I’m working with content and concepts more complex (“fuzzy”) than a bottle of wine, I will allow an internal tagger to select multiple values from a facet to describe an entity. I’m not sure if that, classically, is considered a no-no. [2]

There are two questions here: (1) the nature of the facet values and (2) tagging content with multiple values.

With regard to the facet values themselves, optimally, the values should be mutually exclusive. That is, it should be (relatively) easy to determine whether or not to choose term A or term B. Of course, given the nature of language itself and the nature of the content, you may have some fuzziness. But it’s best to keep that to a minimum so that content taggers can do better tagging and users can more easily find what they are looking for.

Tagging a document with multiple facet values is clearly acceptable if the item being tagged deals in a significant way with more than one topic. There’s nothing wrong with that. In fact, it should be tagged with all relevant terms. And that’s very different from the traditional way of classifying items for a library, for instance, where there is only one “best place” for an item. Here, you want users to be able to find whatever relevant information a document contains, whether it be focused on a single topic or on multiple topics.

What you don’t want, on the other hand, is tagging with multiple terms that mean nearly the same thing just because the vocabulary is too fuzzy and you need to cover all your bases.
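
One way this could look in practice is sketched below: each facet holds a set of values per document, and tags are checked against a controlled vocabulary so that near-synonyms cannot creep in. The vocabulary, the document titles, and the tag and find helpers are all assumptions made up for the example.

```python
# Hypothetical controlled vocabulary: one set of allowed terms per facet.
VOCABULARY = {
    "topic": {"navigation", "search", "metadata"},
    "audience": {"designers", "librarians"},
}


def tag(document, facet, *values):
    """Attach one or more controlled values from a facet to a document."""
    unknown = set(values) - VOCABULARY[facet]
    if unknown:
        # Terms outside the vocabulary are rejected rather than added "to be safe".
        raise ValueError(f"not in the {facet} vocabulary: {sorted(unknown)}")
    document.setdefault(facet, set()).update(values)
    return document


def find(documents, **selected):
    """Return titles of documents tagged with every selected value."""
    return [
        doc["title"] for doc in documents
        if all(value in doc.get(facet, set()) for facet, value in selected.items())
    ]


# A document that deals with two topics is tagged with both.
primer = tag({"title": "Facet primer"}, "topic", "navigation", "metadata")
tag(primer, "audience", "designers")
report = tag({"title": "Search tools report"}, "topic", "search")
documents = [primer, report]

print(find(documents, topic="metadata", audience="designers"))
```

The facet values stay distinct from one another, while a single document is free to carry several of them.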