Types of databases
James D. Anderson and José Pérez-Carballo, “Information Retrieval Design”:
Databases (along with the systems for access that accompany those in electronic form) can be categorized in many ways: by mission or purpose (such as MIS: management information systems), by subject area (such as GIS: geographical information systems), by model of organization (such as relational, hypertext, object-oriented, or flat-file), or by the phenomena that the data represent (real, concrete entities and events, that is, things and objects, versus messages about entities and events, including abstract and imaginary ones). This book focuses on databases designed to facilitate the discovery and retrieval of messages of all types, so we call them “information retrieval databases” or, for short, “IR databases.” The primary data in such databases describe messages rather than concrete entities and events.
Varieties of IR databases
“Database” is a convenient word for the enormous variety of IR tools that librarians, indexers, abstracters, and information specialists of various sorts have developed over the years — indexes, indexing and abstracting services, bibliographies, catalogs, gazetteers, dictionaries, concordances, directories, encyclopedias, handbooks. All of these are organized collections of data designed for retrieval, so all can be legitimately called IR databases.
Two types of databases
Databases for concrete entities and events versus IR databases
Databases can be categorized in many ways: by data model, by purpose, by subject area, and by the kinds of phenomena represented. This last categorization is central to this book. With respect to the primary phenomena represented, databases can be divided into two types: (1) concrete entity and event databases, and (2) IR databases. By far the most common in everyday life are type-1 databases, the concrete entity and event databases. They are designed to provide information about concrete entities (things, objects) and concrete events (transactions, operations, processes). Bank databases and airline databases were cited as examples in section 1.3. University databases are another example: they contain information about every student, course, course offering, instructor, classroom, grade, and tuition payment, all of which are concrete entities and real events. In contrast, IR databases focus on messages, and these messages frequently relate to phenomena that are abstract, vague, emotional, or imaginary: anything but concrete!
Concrete entity and event databases
Concrete entity and event databases are designed around the attributes of, and the relationships among, concrete entities and events. For example, students take courses, get grades, and pay tuition. Instructors teach courses and are paid a set amount at regular intervals. Because these relationships are known in advance, they can be built into the design of the database. In contrast, most IR databases do not attempt to define possible relations in advance. There are simply too many potential relationships among the concepts represented in messages and texts, and some of these relationships are discovered only later, through subsequent use and analysis.
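The contrast between the two types can be illustrated with a minimal sketch in Python. Everything in it is an assumption made for illustration and does not come from the book: the table and column names, the sample student and course, the two sample messages, and the helper functions build_index and search are all hypothetical. The first half fixes the relationship "students take courses and get grades" in a relational schema in advance; the second half builds a simple inverted index over message text, where no relationships among concepts are declared beforehand.

```python
import sqlite3
from collections import defaultdict

# Type 1: a concrete entity and event database.
# The entities (student, course) and the relationship between them
# (enrollment, with a grade attribute) are declared before any data exists.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE course  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL REFERENCES student(id),
        course_id  INTEGER NOT NULL REFERENCES course(id),
        grade      TEXT,
        PRIMARY KEY (student_id, course_id)
    );
""")
# Hypothetical sample rows.
conn.execute("INSERT INTO student VALUES (1, 'Ada')")
conn.execute("INSERT INTO course VALUES (10, 'Indexing')")
conn.execute("INSERT INTO enrollment VALUES (1, 10, 'A')")

# Queries follow the relationships that the schema already defines.
grades = conn.execute("""
    SELECT s.name, c.title, e.grade
    FROM enrollment e
    JOIN student s ON s.id = e.student_id
    JOIN course  c ON c.id = e.course_id
""").fetchall()

# Type 2: an IR database (greatly simplified).
# The primary data describe messages; nothing states in advance how the
# concepts in those messages relate to one another.
messages = {
    "m1": "Grief and memory in the modern novel",
    "m2": "Memory management in operating systems",
}

def build_index(docs):
    """Map each lowercased word to the set of message ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, *terms):
    """Return the ids of messages that contain every query term."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()

index = build_index(messages)
```

In this sketch, grades holds the single enrollment row joined with its student and course, while search(index, "memory") matches both messages and search(index, "memory", "novel") narrows the result to the first. The word "memory" connects a message about fiction with one about operating systems, a relationship that no schema anticipated and that surfaces only when someone searches.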
For information about concrete entity and event databases, the reader should consult a good book on database management systems (DBMS). Most books using that term are talking about concrete entity and event databases. Management information systems (MIS) also consist largely of concrete entity and event databases, although MIS practitioners are paying more and more attention to IR databases.