Database Architecture


We are now in a position to provide a single picture (Figure 1.5) of the various components of a database system and the connections among them.
The architecture of a database system is greatly influenced by the underlying computer system on which the database system runs. Database systems can be centralized, or client-server, where one server machine executes work on behalf of multiple client machines. Database systems can also be designed to exploit par- allel computer architectures. Distributed databases span multiple geographically separated machines.
In Chapter 17 we cover the general structure of modern computer systems. Chapter 18 describes how various actions of a database, in particular query pro- cessing, can be implemented to exploit parallel processing. Chapter 19 presents a number of issues that arise in a distributed database, and describes how to deal with each issue. The issues include how to store data, how to ensure atomicity of transactions that execute at multiple sites, how to perform concurrency control, and how to provide high availability in the presence of failures. Distributed query processing and directory systems are also described in this chapter.



Most users of a database system today are not present at the site of the database system, but connect to it through a network. We can therefore differen- tiate between client machines, on which remote database users work, and server machines, on which the database system runs. 
Database applications are usually partitioned into two or three parts, as in Figure 1.6. In a two-tier architecture, the application resides at the client machine, where it invokes database system functionality at the server machine through query language statements. Application program interface standards like ODBC and JDBC are used for interaction between the client and the server.
In contrast, in a three-tier architecture, the client machine acts as merely a front end and does not contain any direct database calls. Instead, the client end communicates with an application server, usually through a forms interface. The application server in turn communicates with a database system to access data. The business logic of the application, which says what actions to carry out under what conditions, is embedded in the application server, instead of being distributed across multiple clients. Three-tier applications are more appropriate for large applications, and for applications that run on the World Wide Web. 
 

Transaction Management


   Often, several operations on the database form a single logical unit of work. An example is a funds transfer, as in Section 1.2, in which one department account (say A) is debited and another department account (say B) is credited. Clearly, it is essential that either both the credit and debit occur, or that neither occur. That is, the funds transfer must happen in its entirety or not at all. This all-or-none requirement is called atomicity. In addition, it is essential that the execution of the funds transfer preserve the consistency of the database. That is, the value of the sum of the balances of A and B must be preserved. This correctness requirement is called consistency. Finally, after the successful execution of a funds transfer, the new values of the balances of accounts A and B must persist, despite the possibility of system failure. This persistence requirement is called durability.
A transaction is a collection of operations that performs a single logical function in a database application. Each transaction is a unit of both atomicity and consistency. Thus, we require that transactions do not violate any database- consistency constraints. That is, if the database was consistent when a transaction started, the database must be consistent when the transaction successfully ter- minates. However, during the execution of a transaction, it may be necessary temporarily to allow inconsistency, since either the debit of A or the credit of B must be done before the other. This temporary inconsistency, although necessary, may lead to difficulty if a failure occurs.


It is the programmer’s responsibility to define properly the various transac- tions, so that each preserves the consistency of the database. For example, the transaction to transfer funds from the account of department A to the account of department B could be defined to be composed of two separate programs: one that debits account A, and another that credits account B. The execution of these two programs one after the other will indeed preserve consistency. However, each program by itself does not transform the database from a consistent state to a new consistent state. Thus, those programs are not transactions.
Ensuring the atomicity and durability properties is the responsibility of the database system itself specifically, of the recovery manager. In the absence of failures, all transactions complete successfully, and atomicity is achieved easily. However, because of various types of failure, a transaction may not always com- plete its execution successfully. If we are to ensure the atomicity property, a failed transaction must have no effect on the state of the database. Thus, the database must be restored to the state in which it was before the transaction in question started executing. The database system must therefore perform failure recovery, that is, detect system failures and restore the database to the state that existed prior to the occurrence of the failure.
Finally, when several transactions update the database concurrently, the con- sistency of data may no longer be preserved, even though each individual transac- tion is correct. It is the responsibility of the concurrency-control manager to con- trol the interaction among the concurrent transactions, to ensure the consistency of the database. The transaction manager consists of the concurrency-control manager and the recovery manager.
The basic concepts of transaction processing are covered in Chapter 14. The management of concurrent transactions is covered in Chapter 15. Chapter 16 covers failure recovery in detail.
The concept of a transaction has been applied broadly in database systems and applications. While the initial use of transactions was in financial applica- tions, the concept is now used in real-time applications in telecommunication, as well as in the management of long-duration activities such as product design or administrative workflows. These broader applications of the transaction concept are discussed in Chapter 26. 

Data Storage and Querying


A database system is partitioned into modules that deal with each of the re- sponsibilities of the overall system. The functional components of a database system can be broadly divided into the storage manager and the query processor components.
The storage manager is important because databases typically require a large amount of storage space. Corporate databases range in size from hundreds of gigabytes to, for the largest databases, terabytes of data. A gigabyte is approxi- mately 1000 megabytes (actually 1024) (1 billion bytes), and a terabyte is 1 million megabytes (1 trillion bytes). Since the main memory of computers cannot store this much information, the information is stored on disks. Data are moved be- tween disk storage and main memory as needed. Since the movement of data to and from disk is slow relative to the speed of the central processing unit, it is imperative that the database system structure the data so as to minimize the need to move data between disk and main memory.
The query processor is important because it helps the database system to simplify and facilitate access to data. The query processor allows database users to obtain good performance while being able to work at the view level and not be burdened with understanding the physical-level details of the implementation of the system. It is the job of the database system to translate updates and queries written in a nonprocedural language, at the logical level, into an efficient sequence of operations at the physical level.


1.7.1 Storage Manager
The storage manager is the component of a database system that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system. The storage manager is respon- sible for the interaction with the file manager. The raw data are stored on the disk using the file system provided by the operating system. The storage man- ager translates the various DML statements into low-level file-system commands.

Thus, the storage manager is responsible for storing, retrieving, and updating data in the database.
The storage manager components include:
  • Authorization and integrity manager, which tests for the satisfaction of integrity constraints and checks the authority of users to access data.
  • Transaction manager, which ensures that the database remains in a consis- tent (correct) state despite system failures, and that concurrent transaction executions proceed without conflicting.
  • Filemanager,whichmanagestheallocationofspaceondiskstorageandthe data structures used to represent information stored on disk.
  • Buffermanager,whichisresponsibleforfetchingdatafromdiskstorageinto main memory, and deciding what data to cache in main memory. The buffer manager is a critical part of the database system, since it enables the database to handle data sizes that are much larger than the size of main memory.
    The storage manager implements several data structures as part of the phys- ical system implementation:
    • Datafiles,whichstorethedatabaseitself.
    • Data dictionary, which stores metadata about the structure of the database,
      in particular the schema of the database.
    • Indices, which can provide fast access to data items. Like the index in this textbook, a database index provides pointers to those data items that hold a particular value. For example, we could use an index to find the instructor record with a particular ID, or all instructor records with a particular name. Hashing is an alternative to indexing that is faster in some but not all cases.
      We discuss storage media, file structures, and buffer management in Chapter 10. Methods of accessing data efficiently via indexing or hashing are discussed in Chapter 11.
1.7.2 The Query Processor
The query processor components include:
  • DDL interpreter,whichinterpretsDDLstatementsandrecordsthedefinitions in the data dictionary.
  • DML compiler,whichtranslatesDMLstatementsinaquerylanguageintoan evaluation plan consisting of low-level instructions that the query evaluation engine understands.

    A query can usually be translated into any of a number of alternative evaluation plans that all give the same result. The DML compiler also performs query optimization; that is, it picks the lowest cost evaluation plan from among the alternatives.
    Query evaluation engine, which executes low-level instructions generated by the DML compiler.
    Query evaluation is covered in Chapter 12, while the methods by which the query optimizer chooses from among the possible evaluation strategies are discussed in Chapter 13. 

Database Design


Database systems are designed to manage large bodies of information. These large bodies of information do not exist in isolation. They are part of the operation of some enterprise whose end product may be information from the database or may be some device or service for which the database plays only a supporting role.
Database design mainly involves the design of the database schema. The design of a complete database application environment that meets the needs of the enterprise being modeled requires attention to a broader set of issues. In this text, we focus initially on the writing of database queries and the design of database schemas. Chapter 9 discusses the overall process of application design.

1.6.1 Design Process
A high-level data model provides the database designer with a conceptual frame- work in which to specify the data requirements of the database users, and how the database will be structured to fulfill these requirements. The initial phase of database design, then, is to characterize fully the data needs of the prospective database users. The database designer needs to interact extensively with domain experts and users to carry out this task. The outcome of this phase is a specification of user requirements.
Next, the designer chooses a data model, and by applying the concepts of the chosen data model, translates these requirements into a conceptual schema of the database. The schema developed at this conceptual-design phase provides a detailed overview of the enterprise. The designer reviews the schema to confirm that all data requirements are indeed satisfied and are not in conflict with one another. The designer can also examine the design to remove any redundantfeatures. The focus at this point is on describing the data and their relationships, rather than on specifying physical storage details.
In terms of the relational model, the conceptual-design process involves de- cisions on what attributes we want to capture in the database and how to group these attributes to form the various tables. The whatpart is basically a business decision, and we shall not discuss it further in this text. The howpart is mainly a computer-science problem. There are principally two ways to tackle the problem. The first one is to use the entity-relationship model (Section 1.6.3); the other is to employ a set of algorithms (collectively known as normalization) that takes as input the set of all attributes and generates a set of tables (Section 1.6.4).
A fully developed conceptual schema indicates the functional requirements of the enterprise. In a specification of functional requirements, users describe the kinds of operations (or transactions) that will be performed on the data. Example operations include modifying or updating data, searching for and retrieving specific data, and deleting data. At this stage of conceptual design, the designer can review the schema to ensure it meets functional requirements.
The process of moving from an abstract data model to the implementation of the database proceeds in two final design phases. In the logical-design phase, the designer maps the high-level conceptual schema onto the implementation data model of the database system that will be used. The designer uses the resulting system-specific database schema in the subsequent physical-design phase, in which the physical features of the database are specified. These features include the form of file organization and the internal storage structures; they are discussed in Chapter 10.

1.6.2 Database Design for a University Organization
To illustrate the design process, let us examine how a database for a university could be designed. The initial specification of user requirements may be based on interviews with the database users, and on the designer’s own analysis of the organization. The description that arises from this design phase serves as the basis for specifying the conceptual structure of the database. Here are the major characteristics of the university.
The university is organized into departments. Each department is identified
by a unique name (dept budget.
name), is located in a particular building, and has a Eachdepartmenthasalistofcoursesitoffers.Eachcoursehasassociatedwith
page43image21952
it a course prerequisites.
id, title, dept
Instructors are identified by their unique ID. Each instructor has name, asso-
name, and credits, and may also have have associated
page43image24712 page43image24880
ciated department (dept
StudentsareidentifiedbytheiruniqueID.Eachstudenthasaname,anassoci-
name), and salary.
page43image26288
ated major department (dept earned thus far).
name), and tot cred (total credit hours the student 

The university maintains a list of classrooms, specifying the name of the
building, room
Theuniversitymaintainsalistofallclasses(sections)taught.Eachsectionis
number, and room capacity.
page44image2952
identified by a course
a semester, year, building, room class meets).
id, sec
number
, and time
id, year, and semester, and has associated with it slot id (the time slot when the
  • The department has a list of teaching assignments specifying, for each in- structor, the sections the instructor is teaching.
  • The university has a list of all student course registrations, specifying, for each student, the courses and the associated sections that the student has taken (registered for).
    A real university database would be much more complex than the preceding design. However we use this simplified model to help you understand conceptual ideas without getting lost in details of a complex design.

1.6.3 The Entity-Relationship Model

The entity-relationship (E-R) data model uses a collection of basic objects, called entities, and relationships among these objects. An entity is a thingor objectin the real world that is distinguishable from other objects. For example, each person is an entity, and bank accounts can be considered as entities.
Entities are described in a database by a set of attributes. For example, the
attributes dept
in a university, and they form attributes of the department entity set. Similarly, attributes ID, name, and salary may describe an instructor entity.2
name, building, and budget may describe one particular department
page44image17584
The extra attribute ID is used to identify an instructor uniquely (since it may be possible to have two instructors with the same name and the same salary). A unique instructor identifier must be assigned to each instructor. In the United States, many organizations use the social-security number of a person (a unique number the U.S. government assigns to every person in the United States) as a unique identifier.
A relationship is an association among several entities. For example, a member relationship associates an instructor with her department. The set of all entities of the same type and the set of all relationships of the same type are termed an entity set and relationship set, respectively.
The overall logical structure (schema) of a database can be expressed graph- ically by an entity-relationship (E-R) diagram. There are several ways in which to draw these diagrams. One of the most popular is to use the Unified Modeling Language (UML). In the notation we use, which is based on UML, an E-R diagram is represented as follows: 

Entity sets are represented by a rectangular box with the entity set name in
the header and the attributes listed below it.

Relationship sets are represented by a diamond connecting a pair of related entity sets. The name of the relationship is placed inside the diamond.
As an illustration, consider part of a university database consisting of instruc- tors and the departments with which they are associated. Figure 1.3 shows the corresponding E-R diagram. The E-R diagram indicates that there are two entity sets, instructor and department, with attributes as outlined earlier. The diagram also shows a relationship member between instructor and department.
In addition to entities and relationships, the E-R model represents certain constraints to which the contents of a database must conform. One important constraint is mapping cardinalities, which express the number of entities to which another entity can be associated via a relationship set. For example, if each instructor must be associated with only a single department, the E-R model can express that constraint.
The entity-relationship model is widely used in database design, and Chapter 7 explores it in detail.

1.6.4 Normalization
Another method for designing a relational database is to use a process commonly known as normalization. The goal is to generate a set of relation schemas that allows us to store information without unnecessary redundancy, yet also allows us to retrieve information easily. The approach is to design schemas that are in an appropriate normal form. To determine whether a relation schema is in one of the desirable normal forms, we need additional information about the real-world enterprise that we are modeling with the database. The most common approach is to use functional dependencies, which we cover in Section 8.4.
To understand the need for normalization, let us look at what can go wrong in a bad database design. Among the undesirable properties that a bad design may have are:
Repetition of information
Inability to represent certain information 

We shall discuss these problems with the help of a modified database design for our university example.
Suppose that instead of having the two separate tables instructor and depart- ment, we have a single table, faculty, that combines the information from the two tables (as shown in Figure 1.4). Notice that there are two rows in faculty that contain repeated information about the History department, specifically, that department’s building and budget. The repetition of information in our alterna- tive design is undesirable. Repeating information wastes space. Furthermore, it complicates updating the database. Suppose that we wish to change the budget amount of the History department from $50,000 to $46,800. This change must be reflected in the two rows; contrast this with the original design, where this requires an update to only a single row. Thus, updates are more costly under the alternative design than under the original design. When we perform the update in the alternative database, we must ensure that every tuple pertaining to the His- tory department is updated, or else our database will show two different budget values for the History department.
Now, let us shift our attention to the issue of inability to represent certain information.Suppose we are creating a new department in the university. In the alternative design above, we cannot represent directly the information concerning
a department (dept
instructor at the university. This is because rows in the faculty table require values for ID, name, and salary. This means that we cannot record information about the newly created department until the first instructor is hired for the new department.
One solution to this problem is to introduce null values. The null value indicates that the value does not exist (or is not known). An unknown value may be either missing (the value does exist, but we do not have that information) or not known (we do not know whether or not the value actually exists). As we shall see later, null values are difficult to handle, and it is preferable not to resort to them. If we are not willing to deal with null values, then we can create a particular item of department information only when the department has at least one instructor associated with the department. Furthermore, we would have to delete this information when the last instructor in the department departs. Clearly, this situation is undesirable, since, under our original database design, the department information would be available regardless of whether or not there is an instructor associated with the department, and without resorting to null values.
An extensive theory of normalization has been developed that helps formally define what database designs are undesirable, and how to obtain desirable de- signs. Chapter 8 covers relational-database design, including normalization. 

Relational Databases


A relational database is based on the relational model and uses a collection of tables to represent both data and the relationships among those data. It also in- cludes a DML and DDL. In Chapter 2 we present a gentle introduction to the fundamentals of the relational model. Most commercial relational database sys- tems employ the SQL language, which we cover in great detail in Chapters 3, 4, and 5. In Chapter 6 we discuss other influential languages.



1.5.1 Tables
Each table has multiple columns and each column has a unique name. Figure 1.2 presents a sample relational database comprising two tables: one shows details of university instructors and the other shows details of the various university departments.
The first table, the instructor table, shows, for example, that an instructor named Einstein with ID 22222 is a member of the Physics department and has an annual salary of $95,000. The second table, department, shows, for example, that the Biology department is located in the Watson building and has a budget of $90,000. Of course, a real-world university would have many more departments and instructors. We use small tables in the text to illustrate concepts. A larger example for the same schema is available online.
The relational model is an example of a record-based model. Record-based models are so named because the database is structured in fixed-format records of several types. Each table contains records of a particular type. Each record type defines a fixed number of fields, or attributes. The columns of the table correspond to the attributes of the record type.
It is not hard to see how tables may be stored in files. For instance, a special character (such as a comma) may be used to delimit the different attributes of a record, and another special character (such as a new-line character) may be used to delimit records. The relational model hides such low-level implementation details from database developers and users.
We also note that it is possible to create schemas in the relational model that have problems such as unnecessarily duplicated information. For example, sup- pose we store the department budget as an attribute of the instructor record. Then, whenever the value of a particular budget (say that one for the Physics depart- ment) changes, that change must to be reflected in the records of all instructors associated with the Physics department. In Chapter 8, we shall study how to distinguish good schema designs from bad schema designs.


1.5.2 Data-Manipulation Language
The SQL query language is nonprocedural. A query takes as input several tables (possibly only one) and always returns a single table. Here is an example of an SQL query that finds the names of all instructors in the History department:
select instructor.name from instructor where instructor.dept
name = ’History’;
page40image50488
The query specifies that those rows from the table instructor where the dept History must be retrieved, and the name attribute of these rows must be displayed. More specifically, the result of executing this query is a table with a single column labeled name, and a set of rows, each of which contains the name of an instructor
whose dept
will consist of two rows, one with the name El Said and the other with the name Califieri.
name, is History. If the query is run on the table in Figure 1.2, the result
page41image4312
Queries may involve information from more than one table. For instance, the following query finds the instructor ID and department name of all instructors associated with a department with budget of greater than $95,000.
select instructor.ID, department.dept from instructor, department
name name= department.dept
page41image7648
where instructor.dept department.budget > 95000;
If the above query were run on the tables in Figure 1.2, the system would find that there are two departments with budget of greater than $95,000—Computer Science and Finance; there are five instructors in these departments. Thus, the
name and
page41image11000 page41image11168
result will consist of a table with two columns (ID, dept
(12121, Finance), (45565, Computer Science), (10101, Computer Science), (83821, Computer Science), and (76543, Finance).

1.5.3 Data-Definition Language
SQL provides a rich DDL that allows one to define tables, integrity constraints, assertions, etc.
For instance, the following SQL DDL statement defines the department table: create table department
name) and five rows:
page41image15656
(dept building budget
char (20),
char (15), numeric (12,2));
name
page41image17768
Execution of the above DDL statement creates the department table with three
columns: dept
associated with it. We discuss data types in more detail in Chapter 3. In addition, the DDL statement updates the data dictionary, which contains metadata (see Section 1.4.2). The schema of a table is an example of metadata.

1.5.4 Database Access from Application Programs
SQL is not as powerful as a universal Turing machine; that is, there are some computations that are possible using a general-purpose programming language but are not possible using SQL. SQL also does not support actions such as input from users, output to displays, or communication over the network. Such com- putations and actions must be written in a host language, such as C, C++, or Java, with embedded SQL queries that access the data in the database. Application programs are programs that are used to interact with the database in this fashion. 

Examples in a university system are programs that allow students to register for courses, generate class rosters, calculate student GPA, generate payroll checks, etc. To access the database, DML statements need to be executed from the host
language. There are two ways to do this:
  • By providing an application program interface (set of procedures) that can be used to send DML and DDL statements to the database and retrieve the results.
    The Open Database Connectivity (ODBC) standard for use with the C language is a commonly used application program interface standard. The Java Database Connectivity (JDBC) standard provides corresponding features to the Java language.
  • By extending the host language syntax to embed DML calls within the host language program. Usually, a special character prefaces DML calls, and a preprocessor, called the DML precompiler, converts the DML statements to normal procedure calls in the host language. 
  

Database Languages


A database system provides a data-definition language to specify the database schema and a data-manipulation language to express database queries and up dates. In practice, the data-definition and data-manipulation languages are not two separate languages; instead they simply form parts of a single database language, such as the widely used SQL language.

1.4.1 Data-Manipulation Language
A data-manipulation language (DML) is a language that enables users to access or manipulate data as organized by the appropriate data model. The types of access are:
Retrieval of information stored in the database
Insertion of new information into the database
Deletion of information from the database
Modification of information stored in the database
There are basically two types:
  • Procedural DMLs require a user to specify what data are needed and how to get those data.
  • DeclarativeDMLs(alsoreferredtoasnonproceduralDMLs)requireauserto specify what data are needed without specifying how to get those data.
    Declarative DMLs are usually easier to learn and use than are procedural DMLs. However, since a user does not have to specify how to get the data, the database system has to figure out an efficient means of accessing data.
    A query is a statement requesting the retrieval of information. The portion of a DML that involves information retrieval is called a query language. Although technically incorrect, it is common practice to use the terms query language and data-manipulation language synonymously.
    There are a number of database query languages in use, either commercially or experimentally. We study the most widely used query language, SQL, in Chap- ters 3, 4, and 5. We also study some other query languages in Chapter 6.
    The levels of abstraction that we discussed in Section 1.3 apply not only to defining or structuring data, but also to manipulating data. At the physical level, we must define algorithms that allow efficient access to data. At higher levels of abstraction, we emphasize ease of use. The goal is to allow humans to interact efficiently with the system. The query processor component of the database system (which we study in Chapters 12 and 13) translates DML queries into sequences of actions at the physical level of the database system.
1.4.2 Data-Definition Language
We specify a database schema by a set of definitions expressed by a special language called a data-definition language (DDL). The DDL is also used to specify additional properties of the data. 

View of Data


At the logical level, each such record is described by a type definition, as in the previous code segment, and the interrelationship of these record types is defined as well. Programmers using a programming language work at this level of abstraction. Similarly, database administrators usually work at this level of abstraction.
Finally, at the view level, computer users see a set of application programs that hide details of the data types. At the view level, several views of the database are defined, and a database user sees some or all of these views. In addition to hiding details of the logical level of the database, the views also provide a security mechanism to prevent users from accessing certain parts of the database. For example, clerks in the university registrar office can see only that part of the database that has information about students; they cannot access information about salaries of instructors.

1.3.2 Instances and Schemas

Databases change over time as information is inserted and deleted. The collection of information stored in the database at a particular moment is called an instance of the database. The overall design of the database is called the database schema. Schemas are changed infrequently, if at all.
The concept of database schemas and instances can be understood by analogy to a program written in a programming language. A database schema corresponds to the variable declarations (along with associated type definitions) in a program. Each variable has a particular value at a given instant. The values of the variables in a program at a point in time correspond to an instance of a database schema.
Database systems have several schemas, partitioned according to the levels of abstraction. The physical schema describes the database design at the physical level, while the logical schema describes the database design at the logical level. A database may also have several schemas at the view level, sometimes called subschemas, that describe different views of the database.
Of these, the logical schema is by far the most important, in terms of its effect on application programs, since programmers construct applications by using the logical schema. The physical schema is hidden beneath the logical schema, and can usually be changed easily without affecting application programs. Application programs are said to exhibit physical data independence if they do not depend on the physical schema, and thus need not be rewritten if the physical schema changes.
We study languages for describing schemas after introducing the notion of data models in the next section.

1.3.3 Data Models
Underlying the structure of a database is the data model: a collection of conceptual tools for describing data, data relationships, data semantics, and consistency constraints. A data model provides a way to describe the design of a database at the physical, logical, and view levels. 

There are a number of different data models that we shall cover in the text.
The data models can be classified into four different categories:
Relational Model. The relational model uses a collection of tables to repre- sent both data and the relationships among those data. Each table has mul- tiple columns, and each column has a unique name. Tables are also known as relations. The relational model is an example of a record-based model. Record-based models are so named because the database is structured in fixed-format records of several types. Each table contains records of a par- ticular type. Each record type defines a fixed number of fields, or attributes. The columns of the table correspond to the attributes of the record type. The relational data model is the most widely used data model, and a vast major- ity of current database systems are based on the relational model. Chapters 2 through 8 cover the relational model in detail.
Entity-Relationship Model. The entity-relationship (E-R) data model uses a collection of basic objects, called entities, and relationships among these objects. An entity is a thingor objectin the real world that is distinguishable from other objects. The entity-relationship model is widely used in database design, and Chapter 7 explores it in detail.
Object-BasedDataModel.Object-orientedprogramming(especiallyinJava, C++, or C#) has become the dominant software-development methodology. This led to the development of an object-oriented data model that can be seen as extending the E-R model with notions of encapsulation, methods (functions), and object identity. The object-relational data model combines features of the object-oriented data model and relational data model. Chap- ter 22 examines the object-relational data model.
Semistructured Data Model. The semistructured data model permits the specification of data where individual data items of the same type may have different sets of attributes. This is in contrast to the data models mentioned earlier, where every data item of a particular type must have the same set of attributes. The Extensible Markup Language (XML) is widely used to represent semistructured data. Chapter 23 covers it.
Historically, the network data model and the hierarchical data model pre- ceded the relational data model. These models were tied closely to the underlying implementation, and complicated the task of modeling data. As a result they are used little now, except in old database code that is still in service in some places. They are outlined online in Appendices D and E for interested readers.