2.1 Motivation

  • Relational Model Foundation:
    • The relational model is the basis for many commercial Relational Database Management Systems (RDBMS) such as DB2, Informix, Oracle, and Sybase.
    • Structured Query Language (SQL) is the widely accepted standard for data retrieval and updates in RDBMS.
  • Relational Model Simplicity & Limitations:
    • Views data primarily as tables (rows and columns).
    • Typically stores basic data types (integer, string, decimal).
  • Challenges for Traditional RDBMS:
    • Not suitable for complex applications: Traditional RDBMS struggle with applications that require complex data structures or new data types. Examples include:
      • CAD/CAM (Computer-Aided Design/Manufacturing)
      • GIS (Geographic Information Systems)
      • Multimedia databases
      • Imaging and Graphics
    • Limited Type System Extensibility: Users typically cannot add new data types to the system.
    • First Normal Form (1NF) Restriction: RDBMS generally enforce 1NF, meaning every column must hold an atomic value. Collections like sets, lists, or nested tables within a column are not allowed.
  • Rise of OODBMS: Due to these limitations and new application needs, research into Object-Oriented Database Management Systems (OODBMS) began in the early 1980s.

2.2 Concept & Features

  • Lack of Initial Formal Specification: Unlike the relational model defined by Codd, OODBMS initially lacked a single, clear specification, even with products already on the market.
  • The “Manifesto” (1989):
    • An attempt to define OODBMS features, presented at the First International Conference on Deductive and Object-Oriented Databases.
    • Distinguished between mandatory, optional, and open features.
  • Core Idea: An OODBMS combines features from traditional DBMS with features from object-oriented systems.

2.2.1 Mandatory features of object-oriented systems

These features must be present for a system to be considered object-oriented (according to the manifesto authors).

  • Support for Complex Objects:
    • Allows objects to contain attributes that are themselves objects.
    • The schema of an object is not restricted to First Normal Form (1NF).
    • Examples of attribute types forming complex objects: lists, bags (multisets), embedded objects.
  • Object Identity (OID):
    • Every object instance in the database has a unique identifier (OID).
    • The OID is a property of the object itself, distinguishing it from all other objects.
    • An object’s identity remains constant throughout its lifetime, regardless of changes to its attribute values.
    • Objects have an existence independent of their value.
  • Encapsulation:
    • Enforces information hiding.
    • An object’s internal state (data/attributes) can only be accessed and modified by invoking operations (methods) defined within the object’s type/class.
    • Operations intended for external use are made visible through a public interface (e.g., public clause).
    • In an OODBMS, encapsulation means only the operations are visible to the programmer; the data and the implementation details of the operations are hidden.
  • Support for Types or Classes:
    • Used to group similar objects. A system typically offers one notion or the other rather than both.
    • Type: Summarizes common features (attributes, operations) of a set of objects. Often used at compilation time for correctness checks (type checking).
    • Class: Similar concept to type but associated with run-time execution. Refers to the collection of all object instances sharing the same internal structure (attributes) and methods. Objects are instances of their class.
  • Class or Type Hierarchies (Inheritance):
    • Allows classes/types to be organized into hierarchies.
    • A subclass (or subtype) inherits attributes and methods from its superclass (or supertype).
    • Promotes code reuse and modeling of is_a relationships.
  • Overriding, Overloading, and Late Binding (Polymorphism):
    • Overloading: Defining multiple methods with the same name within a class, but with different parameter lists (number or type of parameters). The method to invoke is selected from the arguments supplied, usually at compile time.
    • Overriding: A subclass provides its own specific implementation of a method inherited from its superclass. The method signature (name, parameters) remains the same.
    • Late Binding (Dynamic Binding): The specific implementation code to execute for an overridden method call is determined at run-time, based on the actual class of the object receiving the message, not just the declared type of the variable holding the object. This is essential for polymorphism.
  • Computational Completeness:
    • The Data Manipulation Language (DML) should be a full-fledged programming language, capable of expressing any computable function (like Pascal, C, C++, Java).
    • Contrast with SQL: SQL is relationally complete (can express any query from relational algebra) but not computationally complete.
    • Impedance Mismatch: In RDBMS, SQL (set-oriented) is often embedded in a host programming language (record-at-a-time oriented). This difference in data handling paradigms causes the “impedance mismatch”, making development awkward.
    • OODBMS Goal: OODBs aim for a seamless integration between the database and the programming language, often using one computationally complete language for both general programming and data manipulation, thus overcoming the impedance mismatch.
  • Extensibility:
    • The ability to define new data types.
    • These new types should have the same status and usability as system-defined (built-in) types.
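
Three of the features above (object identity, encapsulation, and overriding with late binding) can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not how any particular OODBMS implements them: Python's built-in id() stands in for an OID (a real OID is assigned by the database and outlives the process), the leading underscore is only a naming convention for hidden state, and the class bodies and sample values are hypothetical.

```python
class Book:
    """Identity is separate from attribute values; state is reached via methods."""

    def __init__(self, title):
        self._title = title              # internal state (hidden by convention)

    def title(self):                     # public operation: query state
        return self._title

    def rename(self, new_title):         # public operation: change state
        self._title = new_title

    def describe(self):
        return "Book: " + self._title


class ArtBook(Book):
    def __init__(self, title, style):
        super().__init__(title)
        self._style = style

    def describe(self):                  # overrides Book.describe, same signature
        return "ArtBook (" + self._style + "): " + self._title


# Object identity: equal values do not make two objects the same object.
a = Book("Roses")
b = Book("Roses")
same_value = a.title() == b.title()
same_object = a is b

# Identity survives a change of attribute values.
oid_before = id(a)                       # id() is only a stand-in for an OID
a.rename("Tulips")
identity_kept = id(a) == oid_before

# Late binding: the method that runs depends on each object's actual class.
shelf = [Book("Roses"), ArtBook("Monet", "Impressionism")]
descriptions = [item.describe() for item in shelf]
```

Here same_value is true while same_object is false, and the call item.describe() resolves to a different implementation for each element of shelf at run time.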

2.2.2 Mandatory features of database systems

These are standard features expected from any DBMS, including OODBMS.

  • Persistence:

    • Data must survive the termination of the process that created it.
    • Data needs to be stored permanently on secondary storage (e.g., disk).
  • Secondary Storage Management:

    • The DBMS must efficiently manage large amounts of data stored on secondary storage.
    • Includes techniques like indexing, caching, buffer management, data placement to optimize performance.
    • These management details are typically hidden from the user.
  • Concurrency Control:

    • The system must provide mechanisms to control concurrent access to data by multiple users.
    • Ensures data integrity and consistency despite simultaneous operations (e.g., using locking). Similar to mechanisms in conventional RDBMS.
  • Recovery:

    • The system must provide mechanisms to restore the database to a consistent state after failures (e.g., system crashes, transaction failures). Similar to mechanisms in conventional RDBMS (e.g., using logging).
  • Ad hoc Query Facility:

    • Must provide a high-level, efficient, application-independent way to query the data.
    • This doesn’t necessarily have to be a textual query language like SQL; it could be a graphical query interface or other high-level tools.
  • Standards Efforts: The Object Data Management Group (ODMG) worked on creating an industry standard for OODBMS (e.g., ODMG-2 released in 1997).
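
The locking mentioned under Concurrency Control can be sketched with an in-memory mutex. This is only an analogy under stated assumptions: a real DBMS uses a lock manager tied to transactions, whereas the sketch below protects a single hypothetical Account object with Python's threading.Lock so that concurrent updates are not lost.

```python
import threading


class Account:
    """Hypothetical shared object whose updates are serialized by a lock."""

    def __init__(self, balance):
        self._balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount):
        # Read-modify-write happens while holding the lock, so no update is lost.
        with self._lock:
            current = self._balance
            self._balance = current + amount

    def balance(self):
        with self._lock:
            return self._balance


def run_deposits(account, n_threads, per_thread):
    """Start several concurrent 'users', each depositing 1 unit per_thread times."""
    def worker():
        for _ in range(per_thread):
            account.deposit(1)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance()


final_balance = run_deposits(Account(0), n_threads=4, per_thread=1000)
```

Because every deposit runs under the lock, the final balance equals the total number of deposits regardless of how the threads interleave.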

2.3 Making OOPL a Database

  • Basic Principle: An OODBMS provides DBMS capabilities (like persistence, querying, concurrency, recovery) for objects created using an Object-Oriented Programming Language (OOPL).
  • Integration Approach: Add persistence to objects defined in a native OOPL (like Java, C++, Smalltalk).
    • The OOPL is extended with constructs like:
      • Persistent Class
      • Database Class
      • Database Interface
      • Database API
    • These extensions provide DBMS functionality integrated into the programming language.
  • Beyond Simple Persistence: OODBMS often provide advanced features, driven historically by needs of applications like CAD/CAM:
    • Fast navigational access (following object references).
    • Support for versions (tracking changes to objects over time).
    • Long transactions (transactions that may last hours or days).
    • Support for persistent objects from multiple programming languages.
    • Distribution of data across multiple servers.
    • Advanced transaction models (e.g., nested transactions).
    • Schema evolution (managing changes to class definitions over time).
    • Dynamic generation of new types.

2.3.1 Object data modeling

  • Components of an Object:

    1. Structure: Attributes and relationships to other objects (e.g., aggregation, association).
    2. Behavior: A set of operations (methods) that can be performed on the object.
    3. Characteristics of Types: Generalization/Specialization (inheritance).
  • Analogy to ER Model: An object is similar to an entity in the Entity-Relationship (ER) model.

  • Figure 2: Book Example Description:

    • A UML-like diagram illustrating object modeling concepts.
    • Classes: Book, Publisher, Author, Chapter, FictionBook, ArtBook.
    • Attributes:
      • Book: title (String), ISDN (Int)
      • Publisher: name (String), registerNo (Int)
      • Author: name (Name in the figure; String in the pseudo-code below), authorNo (Int)
      • Chapter: name (String)
      • FictionBook: age (Int), as given in the figure; what age denotes is not stated.
      • ArtBook: style (String)
    • Relationships:
      • publishedBy: Association between Book and Publisher. Cardinality 1 (Publisher) to 1..* (Book). A book is published by exactly one publisher; a publisher can publish one or more books.
      • writtenBy: Association between Book and Author. Cardinality 1 (Author) to 1 (Book). A book is written by exactly one author; an author writes exactly one book. Note: This 1:1 cardinality is a simplification that follows the diagram.
      • composedOf: Aggregation relationship between Book and Chapter. A Book is composed of Chapters.
    • Inheritance: FictionBook and ArtBook inherit from Book (indicated in the figure by arrows pointing to Book).
  • Object Structure Definition (Example using pseudo-code):

    class Book {
        title: String;
        ISDN: Int;
        publishedBy: Publisher inverse publish; // Relationship attribute
        writtenBy: Author inverse write;     // Relationship attribute
        chapterSet: Set<Chapter>;             // Complex attribute (collection)
    }
     
    class Author {
        name: String;
        authorNo: Int;
        write: Book inverse writtenBy;        // Inverse relationship attribute
    }
  • Attributes:

    • Similar to fields/columns in a relational model.
    • Can be complex types (e.g., Publisher, Author, Set<Chapter> in the Book example). These are objects themselves.
    • In RDBMS, complex attributes are typically represented by foreign keys linking to other tables.
  • Relationships:

    • Represent associations between objects.

    • Examples: publish (inverse of publishedBy), writtenBy (inverse of write).

    • Cardinalities: 1:N (e.g., publishedBy), 1:1 (e.g., writtenBy).

    • composedOf represents aggregation.

    • Realization of Relationships:

      • Often realized using attributes of complex types (object references).
      • Often managed at the behavioral level (through methods). Example: A relationship like Publisher publishing many Books.
      class Publisher {
          // ... other attributes
          publish: Set<Book> inverse publishedBy; // Set of references to Books
       
          Method insert(Book book) {
              publish.add(book); // Add book reference to the set
              // Potentially also set the inverse reference: book.publishedBy = this;
          }
          // ... other methods
      }
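
The pseudo-code above hints that insert should also set the inverse reference. A minimal Python sketch of that idea follows, assuming the "exactly one publisher per book" cardinality from the figure. The attribute names mirror the pseudo-code, while the publisher names and the decision to move a book away from its previous publisher are illustrative assumptions.

```python
class Book:
    def __init__(self, title):
        self.title = title
        self.published_by = None          # reference to a Publisher (publishedBy)


class Publisher:
    def __init__(self, name):
        self.name = name
        self.publish = set()              # references to Books (inverse of publishedBy)

    def insert(self, book):
        """Add the book and keep both directions of the relationship consistent."""
        previous = book.published_by
        if previous is not None and previous is not self:
            previous.publish.discard(book)    # a book has exactly one publisher
        self.publish.add(book)
        book.published_by = self              # set the inverse reference


first = Publisher("Publisher A")          # hypothetical sample data
second = Publisher("Publisher B")
rose = Book("Rose")

first.insert(rose)
second.insert(rose)                       # moves the book to the second publisher
```

Managing the relationship inside insert means callers cannot update one side and forget the other, which is the point of realizing relationships at the behavioral level.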
  • Generalization/Specialization (Inheritance):

    • The is_a relationship.
    • Supported via class hierarchy.
    • Subclasses inherit attributes and methods from superclasses.
    • Example: ArtBook is a Book.
    class ArtBook extends Book { // ArtBook inherits from Book
        style: String;
    }
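
The same is_a relationship can be written in Python as a rough sketch. The attribute names follow the pseudo-code; the label method and the sample values are hypothetical additions that show an inherited method being reused.

```python
class Book:
    def __init__(self, title, isdn):
        self.title = title
        self.isdn = isdn

    def label(self):
        return f"{self.title} ({self.isdn})"


class ArtBook(Book):                       # ArtBook is_a Book
    def __init__(self, title, isdn, style):
        super().__init__(title, isdn)      # inherited attributes are set by Book
        self.style = style                 # attribute added by the subclass


art = ArtBook("Rose", 123, "Baroque")
is_a_book = isinstance(art, Book)
inherited_label = art.label()              # method inherited unchanged from Book
```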
  • Message:

    • The means by which objects communicate.
    • A request from one object (sender) to another object (receiver) to execute one of its methods.
    • Example: Publisher_object.insert("Rose", 123, ...) - sends the insert message to the Publisher_object.
  • Method:

    • Defines the behavior of an object; the implementation code that responds to a message.
    • Purposes:
      • Change the object’s state (modify attribute values).
      • Query the value of selected attributes.
    • Example: The insert method defined in the Publisher class responds to the insert message.

2.3.2 Persistence of objects

  • Definition: Persistence allows program components (specifically objects in OODBMS) to “survive” after the program that created them terminates.
  • Mechanism: Requires storing these components permanently on secondary storage.
  • Making Objects Persistent (Two common ways):
    1. Explicit Call: Call a specific function or method (e.g., persistence()) on an object instance to make that specific object persistent.
    2. Automatic (by Type): Declare a class/type as persistent. All objects created from that persistent class/type are automatically persistent.
  • Market Examples: Gemstone, Objectivity/DB, ObjectStore, Ontos, O2, Itasca, Matisse.
  • Common Features of OODB Products:
    • Support an object-oriented data model.
    • Allow users to create new classes with attributes and methods.
    • Support inheritance.
    • Assign a unique OID to each object instance.
    • Allow retrieval of instances (individually or collectively).
    • Allow loading and running methods on objects.
  • Unified Language: Most OODBs provide a single language (e.g., C++, Smalltalk, Java) for both general-purpose programming and database manipulation, avoiding the impedance mismatch.
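
The two ways of making objects persistent described above (explicit call and automatic by type) can be contrasted in a small sketch. Everything here is hypothetical: ObjectStore, Persistent, the JSON file format, and the integer OIDs are invented for illustration and do not reflect the API of any product listed. Reloading happens in the same process, standing in for a later program run.

```python
import json
import os
import tempfile


class ObjectStore:
    """Toy store: registers objects and writes their state to a JSON file."""

    def __init__(self, path):
        self.path = path
        self._objects = {}                 # OID -> live object
        self._next_oid = 1

    def persist(self, obj):
        """Way 1 (explicit call): mark this one object as persistent."""
        oid = self._next_oid
        self._next_oid += 1
        self._objects[oid] = obj
        return oid

    def commit(self):
        """Write the state of every persistent object to secondary storage."""
        records = {
            str(oid): {"class": type(obj).__name__, "state": vars(obj)}
            for oid, obj in self._objects.items()
        }
        with open(self.path, "w") as f:
            json.dump(records, f)

    @staticmethod
    def load(path, classes):
        """Rebuild objects from the file; `classes` maps class names to classes."""
        with open(path) as f:
            records = json.load(f)
        restored = {}
        for oid, rec in records.items():
            cls = classes[rec["class"]]
            obj = cls.__new__(cls)         # skip __init__, then restore saved state
            obj.__dict__.update(rec["state"])
            restored[int(oid)] = obj
        return restored


class Persistent:
    """Way 2 (by type): every instance of a subclass registers itself."""
    store = None                           # must be set before instances are created

    def __init__(self):
        Persistent.store.persist(self)


class Chapter:                             # ordinary class, transient by default
    def __init__(self, name):
        self.name = name


class Book(Persistent):                    # declared persistent by type
    def __init__(self, title):
        super().__init__()
        self.title = title


path = os.path.join(tempfile.mkdtemp(), "books.json")
Persistent.store = ObjectStore(path)

Book("Rose")                               # persistent automatically
kept = Chapter("Introduction")
Persistent.store.persist(kept)             # persistent by explicit call
Chapter("Draft notes")                     # never persisted, so it is not saved
Persistent.store.commit()

restored = ObjectStore.load(path, {"Book": Book, "Chapter": Chapter})
```

Only the Book and the explicitly persisted Chapter come back from the file; the third object was never registered and is lost.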

2.4 GemStone

  • Overview:
    • One of the first commercial OODBMS products.
    • Developed at Servio Logic (later GemStone Systems).
    • Based on the Smalltalk object-oriented language, with few extensions.
    • Merges OO language concepts with database system concepts.
    • Provides OPAL: an object-oriented database language for Data Definition (DDL), Data Manipulation (DML), and general computation.

2.4.1 Architecture

  • Client/Server Architecture: Distributed over two main process types.

  • Processes:

    1. Stone Process:
      • Runs on the server.
      • Provides core data management capabilities:
        • Disk I/O
        • Concurrency Control
        • Recovery
        • Authorization
      • Uses unique Object-Oriented Pointers (OOPs) as object IDs.
      • Uses an object table to map OOPs to physical storage locations.
    2. Gem Process:
      • Can run on the server or a client machine.
      • Provides user-facing facilities:
        • Compilation of code (OPAL).
        • Browsing capabilities (examining objects and classes).
        • User authentication.
  • Figure 3: GemStone Architecture Description:

    • Shows a VAX system (potentially a client or server) connected via LAN.
    • Multiple GEM Process instances exist.
    • GEM Process interacts with Network Software.
    • Network Software communicates with the STONE Process.
    • STONE Process handles VMS File I/O (specific to the VAX VMS operating system shown).
    • VMS File I/O interacts with the physical Database on storage.
    • Illustrates the separation of concerns between Gem (application/user interface logic) and Stone (storage/database engine logic).
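
The object table described for the Stone process adds one level of indirection between an object's ID and where it is stored. The sketch below shows why that is useful. It assumes a hypothetical (page, offset) location format; the source does not describe GemStone's actual table layout.

```python
class ObjectTable:
    """Maps an OOP (object ID) to the object's current physical location."""

    def __init__(self):
        self._locations = {}               # OOP -> (page number, offset), assumed format
        self._next_oop = 1

    def allocate(self, page, offset):
        """Register a newly stored object and hand out its OOP."""
        oop = self._next_oop
        self._next_oop += 1
        self._locations[oop] = (page, offset)
        return oop

    def locate(self, oop):
        return self._locations[oop]

    def relocate(self, oop, page, offset):
        """Move the object on disk; only the table entry changes, not the OOP."""
        self._locations[oop] = (page, offset)


table = ObjectTable()
book_oop = table.allocate(page=7, offset=128)
reference_held_elsewhere = book_oop        # e.g. stored inside another object

table.relocate(book_oop, page=42, offset=0)
current_location = table.locate(reference_held_elsewhere)
```

References held by other objects stay valid after the move because they store the OOP, never the physical address.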

2.4.2 Object model

  • Closely related to the Smalltalk-80 object model.
  • Principal Concepts:
    1. Object: The fundamental building block.
    2. Message: The means of communication between objects.
    3. Class: The template for creating objects.
  • Persistence: All objects in GemStone are persistent by default.

2.4.2.1 Classes

  • Every object is an instance of exactly one class.
  • A class groups objects that share the same internal structure (instance variables) and behavior (methods).
  • Objects belonging to a class are called instances of that class.

2.4.2.2 Objects

  • A chunk of private memory with a public interface.
  • Internal Structure: Divided into fields called instance variables.
  • Instance Variables: Hold values, which are themselves other objects (everything is an object).
  • Communication: Objects communicate by passing messages.
  • Hierarchy Root: Object is the root class of the entire class hierarchy.

2.4.2.3 Messages

  • All actions in GemStone are invoked via message passing.

  • A message is a request for the receiving object to perform an action (change state or return a result).

  • Protocol: The set of messages an object can respond to defines its “public interface” or protocol.

  • Encapsulation Enforcement: An object can only be inspected or changed through its defined protocol (by sending messages it understands).

  • Message Expression Format: <receiver> <message>

    • <receiver>: An identifier or expression denoting the object that receives and interprets the message.
    • <message>: Specifies the selector (name of the operation) and any required arguments.
  • Figure 4: Message passing in GemStone Description:

    • Shows a sender object sending a message to a receiver object.
    • The message contains a selector.
    • The receiver invokes the corresponding method based on the selector.

2.4.2.4 Methods

  • The concrete code implementations that are executed when an object receives a corresponding message.
  • An object only responds to a message if it has a method whose selector matches the message selector.
  • Methods are used to:
    • Query the object’s state (access internal structure/instance variables).
    • Manipulate the object’s state (modify internal structure/instance variables).
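
The link between a message's selector and the method that answers it can be sketched in Python. This is an analogy, not OPAL: the send function, the MessageNotUnderstood exception, and the Publisher methods are hypothetical names chosen for illustration.

```python
class MessageNotUnderstood(Exception):
    """Raised when the receiver has no method matching the selector."""


def send(receiver, selector, *arguments):
    """Deliver a message: find the method named by the selector and run it."""
    method = getattr(receiver, selector, None)
    if not callable(method):
        raise MessageNotUnderstood(
            f"{type(receiver).__name__} has no method {selector!r}"
        )
    return method(*arguments)


class Publisher:
    def __init__(self):
        self._titles = []

    def insert(self, title):               # method that manipulates state
        self._titles.append(title)

    def count(self):                       # method that queries state
        return len(self._titles)


publisher = Publisher()
send(publisher, "insert", "Rose")          # receiver, selector, argument
number_of_titles = send(publisher, "count")

try:
    send(publisher, "delete", "Rose")      # no method with this selector
    understood = True
except MessageNotUnderstood:
    understood = False
```

The receiver answers insert and count because it has methods with those selectors; the delete message is rejected because no matching method exists.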

2.4.3 Collection classes

  • While a class defines object structure, it doesn’t typically keep track of all its instances directly.

  • Collection Objects: Used to store groups of other objects (instances).

    • Examples: Arrays, Bags, Sets.
    • Can store objects that are not necessarily of the same type.
    • Store objects in indexable or anonymous storage slots.
  • Built-in Support: GemStone provides a pre-defined Collection class and various subclasses for managing groups of objects.

  • Subclasses of Collection:

    • Array:
      • A subclass of SequenceableCollection (which is a subclass of Collection).
      • Elements are ordered and accessed by index.
      • Similar to String (which is also sequenceable).
    • Bag and Set:
      • Non-sequenceable subclasses of Collection (marked NSC in the diagram).
      • Instance variables are anonymous (elements are not accessed by position/index in the same way as Array).
      • Do not maintain an order on their elements.
      • Difference:
        • Bag: May contain the same object multiple times (allows duplicates).
        • Set: Contains any given element only once (duplicates are ignored or prevented).
  • Figure 5: Class hierarchy of Collection classes Description:

    • Shows a partial class hierarchy diagram.
    • Object is the root.
    • Collection inherits from Object.
    • SequenceableCollection and Bag (NSC) inherit from Collection. NSC presumably stands for non-sequenceable collection.
    • String and Array inherit from SequenceableCollection.
    • Set inherits from Bag (NSC): a Set is a non-sequenceable collection that holds each element at most once.
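
The behavioral differences between Array, Bag, and Set can be tried out with Python's built-in analogues. This is a loose analogy under stated assumptions: list, collections.Counter, and set are not GemStone classes, they are not related by the hierarchy in the figure, and they compare elements by value, which may differ from how GemStone decides that two elements are the same.

```python
from collections import Counter

titles = ["Rose", "Tulip", "Rose"]

array = list(titles)                # Array: ordered, elements accessed by index
bag = Counter(titles)               # Bag: unordered, duplicates kept as counts
unique = set(titles)                # Set: unordered, each element at most once

first_element = array[0]            # positional access only makes sense for the array
rose_copies_in_bag = bag["Rose"]    # the bag remembers both occurrences
bag_size = sum(bag.values())
set_size = len(unique)              # the duplicate "Rose" was dropped

# Collections may hold elements of different types.
mixed = {1, "Rose", (2, 3)}
```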

2.5 Comparisons of OODBS & RDBS

2.5.1 Correspondence between OODBS and RDBS

The following table shows an approximate correspondence. Concepts are not directly equivalent.

OODBS Concept      | RDBS Concept
-------------------|-----------------------------------
object             | tuple (row)
instance variable  | column, attribute
class hierarchy    | database schema (is-a relation) *
collection class   | relation (table)
OID (Object ID)    | key (typically primary key)
message            | procedure call
method             | procedure body

*Note: RDBMS don’t directly support inheritance in the schema definition the way OODBMS do via class hierarchies. Simulating is-a often involves separate tables and foreign keys or specific table-per-hierarchy strategies.

Important Caveat: This table represents only an approximate equivalence. The fundamental concepts and properties differ significantly between the two models (e.g., behavior encapsulation in OODBS vs. data-only tuples in RDBMS, object identity vs. value-based identity).
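
To make the correspondence concrete, the sketch below stores the same fact twice: once as tuples linked by a key value, queried with a join, and once as objects linked by a reference, reached by navigation. It uses Python's standard sqlite3 module with an in-memory database. The table layout, column names, and sample data are assumptions made for this illustration.

```python
import sqlite3

# Relational side: tuples in relations, linked by key values and joined by SQL.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE publisher (registerNo INTEGER PRIMARY KEY, name TEXT)")
db.execute(
    "CREATE TABLE book (isdn INTEGER PRIMARY KEY, title TEXT, "
    "publisher INTEGER REFERENCES publisher(registerNo))"
)
db.execute("INSERT INTO publisher VALUES (1, 'Publisher A')")
db.execute("INSERT INTO book VALUES (123, 'Rose', 1)")
row = db.execute(
    "SELECT p.name FROM book b JOIN publisher p ON b.publisher = p.registerNo "
    "WHERE b.isdn = 123"
).fetchone()
publisher_via_join = row[0]
db.close()


# Object side: objects linked by references and reached by navigation.
class Publisher:
    def __init__(self, name):
        self.name = name


class Book:
    def __init__(self, title, published_by):
        self.title = title
        self.published_by = published_by   # direct reference, no key lookup


rose = Book("Rose", Publisher("Publisher A"))
publisher_via_reference = rose.published_by.name
```

Both paths yield the same publisher name. The relational version needs a key column and a set-oriented query embedded as a string in the host language, while the object version follows a reference in the host language itself.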

2.5.2 Comparison (Advantages & Disadvantages)

  • Advantages of OODBS (over RDBMS):
    • Complex objects & relations: Naturally models complex data structures.
    • Class hierarchy: Supports inheritance, promoting reuse and extensibility.
    • No impedance mismatch: Seamless integration with OOPL.
    • No primary keys (OID-based): Object identity is inherent.
    • One data model: Single model for application and database.
    • High performance on certain tasks: E.g., navigational access through complex object networks.
    • Less programming effort: Due to inheritance, reuse, extensibility.
  • Disadvantages of OODBS (compared to RDBMS):
    • Schema change: Non-trivial; often involves system-wide recompilation.
    • Lack of agreed upon standard: ODMG existed, but less universal than SQL.
    • Lack of universal query language: No single dominant language like SQL.
    • Lack of Ad-Hoc query: Often less flexible/powerful ad-hoc querying tools.
    • Language dependence: Often tied to a specific OOPL (C++, Java, Smalltalk).
    • Concurrency support: Historically, sometimes weaker support for many concurrent users compared to mature RDBMS.
  • Rise of ORDBMS: Due to OODBS disadvantages (especially lack of standards, query language issues), Object-Relational DBMS (ORDBMS) became popular, adding object features to relational systems.
  • Future Outlook: Expect continued presence of:
    • OODBMS: Serving specialized markets needing deep object integration (CAD, GIS, etc.).
    • ORDBMS: Dominating traditional commercial markets, offering a blend of relational stability and object features.
  • Next Topic: The following chapter discusses indexing in OODBMS, specifically in GemStone.