Indexes versus IR databases; components of IR databases

From James D. Anderson and José Pérez-Carballo, “Information Retrieval Design”:

The line between an index and an IR database is a very fine and fuzzy one. They are closely related and intertwined. Actually, as we explore the design of IR databases, we will find that an IR database generally consists of three major components, of which the index is one. The other two components are (2) the collection of documents (in full-text databases) and surrogates (representations or descriptions of documents), and (3) the collection of terms that is used for the description and retrieval of documents. In some IR databases, this third component, the collection of index terms, is expanded to include synonymous, equivalent and variant terms and relations among them. Such an elaborated vocabulary component is called a thesaurus.
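
As a rough illustration of the three components named in the passage above, here is a minimal Python sketch. It is not taken from Anderson and Pérez-Carballo; the class names (`Surrogate`, `IRDatabase`), the sample records, and the choice to model the thesaurus as a simple variant-to-preferred-term mapping are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Surrogate:
    """A description of a document; the full text is optional."""
    doc_id: int
    title: str
    full_text: str = ""


@dataclass
class IRDatabase:
    # Documents and/or their surrogates, keyed by document id.
    surrogates: dict = field(default_factory=dict)
    # The vocabulary: maps each lowercase variant or synonym to a preferred
    # index term. With these relations it acts as a very small thesaurus.
    thesaurus: dict = field(default_factory=dict)
    # The index: maps each preferred term to the ids of matching documents.
    index: dict = field(default_factory=lambda: defaultdict(set))

    def preferred(self, term):
        """Resolve a term to its preferred form (itself if unlisted)."""
        term = term.lower()
        return self.thesaurus.get(term, term)

    def add(self, surrogate, terms):
        """Store a surrogate and index it under the given terms."""
        self.surrogates[surrogate.doc_id] = surrogate
        for term in terms:
            self.index[self.preferred(term)].add(surrogate.doc_id)

    def search(self, term):
        """Return surrogates indexed under the term or any of its variants."""
        ids = self.index.get(self.preferred(term), set())
        return [self.surrogates[i] for i in sorted(ids)]


# Hypothetical sample data: two variants point at the preferred term "cars".
db = IRDatabase(thesaurus={"automobiles": "cars", "autos": "cars"})
db.add(Surrogate(1, "A history of the automobile"), ["Automobiles"])
db.add(Surrogate(2, "Indexing by hand"), ["indexing"])

titles = [s.title for s in db.search("autos")]
print(titles)  # ['A history of the automobile']
```

A search for "autos" finds a document that was indexed under "Automobiles" because both resolve to the same preferred term. That is the kind of work the elaborated vocabulary component does, and it also shows why the index and the rest of the database are hard to separate cleanly.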


Further Reading: Papers

The nature of indexing: how humans and machines analyze messages and texts for retrieval. Part I: Research, and the nature of human indexing (2001)
Introduction to modern information retrieval – McGill, Salton – 1983
Indexing by Latent Semantic Analysis – Deerwester, Dumais, et al. – 1990
Term-weighting approaches in automatic text retrieval – Salton, Buckley – 1988
Automatic Text Processing: The Transformation, Analysis and Retrieval of Information by Computer – Salton, editor – 1989
Improving retrieval performance by relevance feedback – Salton, Buckley – 1990
Frequency analysis of English usage: Lexicon and Grammar – Francis, Kucera – 1982
A probabilistic model of information retrieval: development and comparative experiments – Jones, Walker, et al. – 2000
Viewing morphology as an inference process – Krovetz – 1993
Bibliographic coupling between scientific papers – Kessler – 1963
Information Storage and Retrieval – Korfhage
Stemming algorithms: A case study for detailed evaluation – Hull – 1996
Information Retrieval Systems – Theory and Implementation – Kowalski – 1997
Improving browsing in digital libraries with keyphrase indexes. Decision Support Systems – Gutwin, Paynter, et al. – 1999
A Study of Information Seeking and Retrieving. III. Searchers, searches, and overlap – Saracevic, Kantor – 1988
Indexing and access for digital libraries and the Internet: Human, database, and domain factors – Bates – 1998
The Human Use of Human Beings: Cybernetics and Society – Wiener – 1956
Indexing and abstracting in theory and practice – Lancaster – 2003
A study of information seeking and retrieving. I. Background and methodology – Saracevic, Kantor, et al. – 1988
A Theory of Indexing – Salton – 1975
Interindexer Consistency Tests: A Literature Review and Report of a Test of Consistency – Markey – 1984
Using an N-gram-Based Document Representation with a Vector Processing Retrieval Model, The Fourth Text Retrieval Conference (TREC-3) – Cavnar – 1995
A Method for the Evaluation of Stemming Algorithms Based on Error Counting – Paice – 1996
Searchers’ selection of search keys: I. The selection routine – Fidel – 1991
Two kinds of power: An essay on bibliographical control – Wilson – 1968
Using Latent Semantic Indexing for Literature Based Discovery – Gordon, Dumais
Organizing Information: Principles of Data Base and Retrieval Systems – Soergel – 1985
The Organization of Information – Taylor – 1999
Information filtering and information retrieval: two sides of the same coin – Belkin, Croft – 1992
Term Relevance feedback and mediated database searching: implications for information retrieval practice and system design – Spink – 1995
Rules of indexing: A critique of mentalism in information retrieval theory – Frohmann – 1990
Natural language information retrieval TREC-6 report – Strzalkowski, Lin, et al. – 1998
Consistency in the selection of search concepts and search terms – Iivonen – 1995
The Use of Categories and Clusters for Organizing Retrieval Results. Natural Language Information Retrieval – Hearst – 1999
Evaluating natural language processing techniques in information retrieval – Strzalkowski – 1999
Text retrieval conference – TREC
Exemplary documents: a foundation for information retrieval design – Blair, Kimbrough
Natural language information retrieval: progress report – Perez-Carballo, Strzalkowski
The Association Factor in Information Retrieval – Stiles
Indexing using both n-grams and words – Mayfield, McNamee – 1999
Indexing Books – Mulvany – 2005
Rigorous systematic bibliography – Bates – 1976
Inter-indexer consistency studies, 1954-1975: a review of the literature and summary of study results – Leonard – 1977
Methodologies for subject analysis in bibliographic databases – Milstead – 1992
Subject Analysis and Indexing: Theoretical Foundation and Practical Advice – Fugmann – 1993
Indexing from A to Z – Wellisch – 1995
Indexing documents by gedanken experimentation – Cooper – 1978
Some fundamental concepts of information retrieval. Drexel Library Quarterly – Wilson – 1978
A cognitive process model of document indexing – Farrow – 1991
Challenges in indexing electronic text and images – Fidel, Hahn, et al. (editors) – 1994
Temporal Structures in Bibliographic Classification – Fairthorne – 1974






