Current location - Education and Training Encyclopedia - University rankings - Three paradigms of database?
Three paradigms of database?
The design paradigm of database is the specification that database design needs to meet. The database structure that meets these specifications is simple and clear, and there will be no abnormal operations such as insertion, deletion and update. On the contrary, it is a mess, which not only creates trouble for database programmers, but also looks disgusting and may store a lot of unnecessary redundant information.

Is the design paradigm difficult to understand? No, we were given a bunch of mathematical formulas in college textbooks. Of course, we can't understand them or remember them. So many of us didn't design the database according to the example at all.

In essence, design paradigm can be clearly expressed and understood in very vivid and concise language. This paper will explain the paradigms in a popular way, and take the database of a simple forum designed by the author as an example to illustrate how to apply these paradigms to practical projects.

Paradigm interpretation

First normal form (1NF): The fields in the database table are all single attributes and cannot be subdivided. This single attribute consists of basic types, including integer, real number, character, logic, date and so on.

For example, the following database tables conform to first normal form:

Field 1 Field 2 Field 3 Field 4

And such a database table does not conform to first normal form:

Field 1 field 2 field 3 field 4 field 3. 1 field 3.2

Obviously, in any current relational database management system (DBMS), it is impossible for a fool to make a database that does not conform to first normal form, because these DBMS do not allow one column of a database table to be divided into two or more columns. Therefore, it is impossible for you to design a database that does not conform to first normal form in the existing DBMS.

The second normal form (2NF): non-key fields have no partial functional dependence on any candidate key fields in the database table (partial functional dependence refers to the situation that some fields in the combined keywords determine the non-key fields), that is, all non-key fields are completely dependent on any group of candidate keys.

Assume that the course selection relationship table is SelectCourse (student number, name, age, course name, grades, credits), and the keywords are combined keywords (student number, course name), because there are the following decisive relationships:

(student number, course name) → (name, age, grades, credits)

This database table does not satisfy the second normal form, because there is the following decisive relationship:

(Course Name) → (Credit)

(Student ID) → (Name, age)

That is to say, there are cases where fields in combined keywords determine non-keywords.

Because it does not conform to 2NF, there will be the following problems in this course selection relationship table:

(1) Data redundancy:

The same course is selected by n students, and the "credit" is repeated n- 1 time; The same student took m classes, and his name and age were repeated m- 1 times.

(2) Update exception:

If the credits of a course are adjusted, the "credits" values of all rows in the data table should be updated, otherwise different credits of the same course will appear.

(3) Insert an exception:

Suppose you want to open a new course, and no one has learned it yet. In this way, because there is no "student number" keyword, the course name and credits cannot be recorded in the database.

(4) Delete the exception:

Assuming that a group of students have completed elective courses, these elective records should be deleted from the database table. But at the same time, the course name and credit information have also been deleted. Obviously, this will also lead to insertion exceptions.

Change the course selection relationship table to the following three tables:

Student: student (student number, name, age);

Courses: courses (course name, credits);

Course selection relationship: SelectCourse (student number, course name, grade).

Such a database table conforms to the second normal form, eliminating data redundancy, updating exceptions, inserting exceptions and deleting exceptions.

In addition, all single keyword database tables conform to the second normal form, because it is impossible to have combined keywords.

Third Normal Form (3NF): On the basis of the second normal form, if the non-key fields have no transfer function dependence on any candidate key fields, the data table conforms to the third normal form. The so-called transfer function dependence means that if there is a decisive relationship of "A → B → C", the transfer function of C depends on A ... Therefore, the database table satisfying the third normal form should not have the following dependencies:

Key field → Non-key field x → Non-key field Y

Suppose the student relationship table is a student (student number, name, age, college, college location, college phone number) and the keyword is a single keyword "student number", because there are the following decisive relationships:

(Student ID) → (Name, age, college, college location, college phone number) This database conforms to 2NF, but does not conform to 3NF, because there are the following decisive relations:

(Student ID) → (College )→ (College location, college phone number)

That is, the non-key fields "College Location" and "College Phone" depend on the transfer function of the key field "Student Number".

It will also have data redundancy, update exceptions, insert exceptions and delete exceptions, which readers can analyze by themselves.

Divide the student relationship table into the following two tables:

Student: (student number, name, age, college);

College: (College, location, telephone).

Such a database table conforms to the third normal form, eliminating data redundancy, updating exceptions, inserting exceptions and deleting exceptions.

Bowes-Cod normal form (BCNF): On the basis of the third normal form, if any field has no transfer function dependence on any candidate key field, the database table conforms to the third normal form.

Suppose that the relational table of warehouse management is StorehouseManage (warehouse ID, storage item ID, administrator ID, quantity), and one administrator only works in one warehouse; A warehouse can store all kinds of articles. There are the following decisive relationships in the database table:

(warehouse ID, storage item ID) → (administrator ID, quantity)

(Administrator ID, storage item ID) → (Warehouse ID, quantity)

Therefore, (warehouse ID, storage item ID) and (administrator ID, storage item ID) are candidate keys for StorehouseManage, and the only non-key field in the table is quantity, which conforms to the third normal form. However, due to the following decisive relationship:

(warehouse ID) → (administrator ID)

(Administrator ID) → (Warehouse ID)

That is, there is a situation that key fields determine key fields, so it does not conform to the BCNF paradigm. It will have the following exceptions:

(1) Delete exception:

When the warehouse is emptied, all the information of "storage item ID" and "quantity" are deleted, and at the same time, the information of "warehouse ID" and "administrator ID" are also deleted.

(2) Insert exception:

You cannot assign an administrator to a warehouse when nothing is stored in the warehouse.

(3) Update exception:

If the warehouse has a new administrator, the administrator ID of all rows in the table will be modified.

The warehouse management relational table is decomposed into two relational tables:

Warehouse management: StorehouseManage (warehouse ID, administrator ID);

Warehouse: warehouse (warehouse identification, storage item identification, quantity).

This database table conforms to the BCNF paradigm, eliminating deletion exceptions, insertion exceptions and update exceptions.

Example application

Let's get a forum database step by step, which contains the following information:

(1) User: user name, email, home page, phone number and contact address.

(2) Post: post title, post content, reply title and reply content.

For the first time, we designed the database as only tables:

User name, email home page, telephone contact address, post title, post content, reply title, reply content.

The database table conforms to first normal form, but no group of candidate key can determine the whole row of the database table, and the unique keyword field user name cannot completely determine the whole tuple. We need to add "Publish ID" and "Reply ID" fields, that is, modify the table to:

User name, email home page, telephone contact address, post ID, post title, post content reply ID, reply title reply content.

In this way, the keywords (user name, posting ID and reply ID) in the data table can determine the whole line:

(User name, posting ID, reply ID) → (email address, home page, phone number, contact address, posting title, posting content, reply title, reply content)

However, such a design does not conform to the second normal form, because there is the following decisive relationship:

(User Name) → (Email, Home Page, Phone Number, Contact Address)

(Posting ID) → (Posting title, posting content)

(reply ID) → (reply title, reply content)

In other words, some functions of non-key fields depend on candidate key fields. Obviously, this design will lead to a lot of data redundancy and abnormal operation.

We decompose the database table into (underlined keywords):

(1) User information: user name, email address, home page, telephone number and contact address.

(2) Post information: post ID, title and content.

(3) Reply information: reply ID, title and content.

(4) Release: user name, release ID.

(5) reply: posting ID, reply ID

This design meets the requirements of 1, 2,3 paradigm and BCNF paradigm, but is this design the best?

Not necessarily.

It is observed that the relationship between "user name" and "posting ID" in item 4 is 1: n, so we can merge "posting" into item 2 "posting information"; The relationship between "posting ID" and "reply ID" in item 5 "reply" is also 1: n, so we can merge "reply" into item 3 "reply information". This can reduce data redundancy to some extent. The new design is: (1) User information: user name, email address, home page, telephone number and contact address.

(2) Post information: user name, post ID, title and content.

(3) Reply information: posting ID, reply ID, title and content.

The database table 1 obviously meets the requirements of all paradigms;

The key field "posting ID" in Table 2 of the database has some functional dependencies on non-key fields "title" and "content", which does not meet the requirements of the second normal form, but this design will not lead to data redundancy and abnormal operation.

In database table 3, there are some non-key fields "title" and "content" that depend on the key field "reply ID", which does not meet the requirements of the second normal form, but similar to database table 2, this design will not lead to data redundancy and abnormal operation.

It can be seen that it is not necessary to meet the requirements of formal form. For the relationship of 1: n, when one side of 1 is merged with the other side of n, the other side of n no longer meets the second normal form, but this design is better!

For the relationship of M: N, one side of M or one side of N cannot be merged into the other side, which will lead to non-compliance with the requirements of the normal form, abnormal operation and data redundancy.

For the relationship of 1: 1, we can merge 1 on the left or 1 on the right to the other side. The design will not meet the requirements of the normal form, but it will not lead to abnormal operation and data redundancy.