Abstract
Functional dependencies and Normalization play an important role in relational database design. Functional dependencies are key to establishing the relationship between key and non-key attributes in a relation. Normalization process works towards removing anomalies in a relation and prevents data redundancy. This paper, intended to be a graduate research paper, establishes the definitions of these concepts. The paper introduces Functional Dependency and explains its inference rules. The paper also introduces Normalization and various normal forms 1NF thru 5NF including the BCNF. The paper also explains how Functional Dependencies and Normalization are related, why they are important with regards to relational databases, and advantages of designing a normalized database.
Relational Databases: Functional Dependency and Normalization
Definitions and Concepts
Functional Dependency
A functional dependency is a constraint between two sets of attributes from the database. A functional dependency, represented by X → Y, between two sets of attributes X and Y that are subsets of a relation R specifies a constraint that, for any two tuples t1 and t2 in R that have t1[X] = t2[X],they must also have t1[Y] = t2[Y].
This means the values of the set of attributes Y of a tuple in R are determined by the set of attributes X. In other words, the values of the set of attributes X functionally determine the values of the set of attributes Y. We can also say that Y is functionally dependent on Y.
The set of attributes X on the left-hand side of the functional dependency is called determinant and the set of attributes Y on the right-hand side of the functional dependency is called dependent. Despite the mathematical definition a functional dependency cannot be determined automatically. It is a property of the semantics of attributes – the database designers will have to understand how the attributes are related to each other to specify a functional dependency. (Elmasri, Ramez and Shamkant B. Navathe. 2006)
Example
Consider the example of an SSN (Social Security Number) database. Every individual has a unique SSN. So, the other attributes of the relation, like name, address etc. can be determined using the SSN. That makes SSN a determinant, and name, address, the dependents – thus establishing the functional dependencies:
SSN name
SSN address
Inference Rules for Functional Dependency
Armstrong’s axioms are a set of inference rules used to infer all the functional dependencies on a relational database. They were developed by William W. Armstrong.
Axiom of reflexivity: if Y is a subset of X, then X determines Y
If Y is a subset of X then X Y
Axiom of augmentation: if X determines Y, then XZ determines YZ for any Z
If X Y, then XZ YZ
Axiom of transitivity: if X determines Y and Y determines Z, then X must determine Z
If X Y and Y Z, then X Z
Union: if X determines Y and X determines Z then X must also determine Y and Z
If X Y and X Z, then X YZ
Decomposition: if X determines Y and Z, then X determines Y and X determines Z separately
If X YZ, then X Y and X Z
Normalization
Database Normalization is a process that allows the storage of data without unnecessary redundancy and thereby eliminate data inconsistency. A normalized database eliminates anomalies in updating, inserting, and deleting data, which improves the efficiency and effectiveness of the database. Users can maintain and retrieve data from a normalized database without difficulty. Data Normalization can be used by the designer of a database to identify and group together related data elements (attributes) and establish relationships between the groups.
Database Normalization concept and its ‘Normal Forms’ were originally invented by Edgar Codd, the inventor of the relational model. The ‘Normal Forms’ provide the criteria for determining a table’s degree of vulnerability to logical inconsistencies and anomalies. The higher the normal form applicable to a table, the less vulnerable it is.
First Normal Form (1NF)
An entity type or table is in 1NF when each of its attributes contain simple values which are atomic and contains no repeating groups of data. The domain of an attribute in an 1NF table must include only atomic (simple, indivisible) values and that the value of any attribute in a tuple must be a single value from the domain of that attribute.
Example
Consider the address attribute in a sales database. It is not an atomic attribute, because it is made up of atomic attributes as street, city, state and zip. For the relation to be in 1NF, the appropriate database design should have the atomic attributes street, city, state and zip instead of an address attribute.
Un-Normalized: sales (date, order_no, product_no, product_description, price, quantity_sold, cust_name, cust_address)
1NF: sales (date, order_no, product_no, product_description, price, quantity_sold, cust_name, cust_street, cust_city, cust_state, cust_zip)
Second Normal Form (2NF)
An entity type or table is in 2NF when it is in 1NF and all its non-key attributes depend on the whole key (i.e., functional dependency). There cannot be partial dependencies.
Example
Continuing with the sales database, the order_no and the product_no form the composite key for the table. There are partial dependencies – date is dependent on order_no, but not product_no – which violates the requirement for 2NF. The product_description is dependent on product_no and not on order_no. Removing these partial dependencies will result in 2NF.
1NF: sales (date, order_no, product_no, product_description, price, quantity_sold, customer_name, customer_street, customer_city, customer_state, customer_zip)
2NF: order (date, order_no, cust_no);
product (product_no, product_description, price);
order_detail (order_no, product_no, quantity_sold);
customer (cust_no, cust_name, cust_street, cust_city, cust_state, cust_zip)
Third Normal Form (3NF)
An entity type or table is in 3NF when it is in 2NF and non-key attributes do not depend on other non-key attributes (i.e., there is no transitive dependency).
Example
Continuing with the sales database, the non-key attributes cust_city and cust_state are dependent on cust_zip which is a non-key attribute. Creating a separate zip table will transform the design into 3NF, where in there are no more dependencies between non-key attributes.
2NF: order (date, order_no, cust_no);
product (product_no, product_description, price);
order_detail (order_no, product_no, quantity_sold);
customer (cust_no, cust_name, cust_street, cust_city, cust_state, cust_zip)
3NF: order (date, order_no, cust_no);
product (product_no, product_description, price);
order_detail (order_no, product_no, quantity_sold);
customer (cust_no, cust_name, cust_street, zip_code);
zip (zip_code, city, state)
Boyce Codd Normal Form (BCNF)
An entity type or table is in BCNF when it is in 3NF and all candidate keys defined for the relation satisfy the test for third normal form.
Example
Continuing with the sales database, all the candidate keys already satisfy the 3NF requirements.
Fourth Normal Form (4NF)
An entity type or table is in 4NF when it is in BCNF and there are no non-trivial multi-valued dependencies. To move from BCNF to 4NF, remove any independently multi-valued components of the primary key to two new parent entities.
Example
For example, a professor can teach multiple subjects and can also mentor multiple students. To be in 4NF, the professor to subjects should be a separate relation and professor to students should be a separate relation – since they are independent of each other
Fifth Normal Form (5NF)
To be in 5NF, a relation decomposed into two relations must have lossless-join property, which ensures that no spurious tuples are generated when relations are reunited through a natural join.
Example
In the sales database example, when the sales database was split into order and product, the natural join of those two tables does not result in loss of data (tuples).
(Russell, Gordon. Chapter 4; Nguyen Kim Anh, Relational Design Theory)
Importance of Functional Dependency and Normalization to Relational Model
How are they related
Normalization theory draws heavily on the theory of functional dependencies. When a database designer sets out to design a database, it is essential to understand the semantics of the data – how the attributes are related to one another. This helps in establishing the functional dependencies between attributes. Once the functional dependencies are identified, the design the database in to a ‘normal form’ of the highest order possible is easier. Rules for each normal form, starting from the 1NF are invariably framed around maintaining the functional dependencies and are also based on the inference rules for functional dependencies (refer Inference Rules section). For example, to be in 2NF the non-key attributes should be dependent on the whole-key, which means the functional dependencies should be satisfied. Similarly, to be in 3NF, transitive dependency should be removed, which can be done if the functional dependencies are established correctly.
In other words, database normalization process ensures an efficient organization of data in database tables, which results in guaranteeing that data dependencies make sense, and also reducing the space occupied by the database via eliminating redundant data.
Why are they necessary for Relational Database model?
Functional dependencies play an important role in relational database design. They are used to establish keys that are used to define normal forms for relations. In addition, they help in deriving constraints based on the relationships between attributes. As a database grows in size and complexity it is essential that order and organization be maintained to control these complexities and minimize errors and redundancy in the associated data. This goal is managed by normalization. Database normalization minimizes data duplication to safeguard databases against logical and structural problems, such as data anomalies.
Get Help With Your Essay
If you need assistance with writing your essay, our professional essay writing service is here to help!
Essay Writing Service
Normalization can help keep the data free of errors and can also help ensure that the size of the database doesn’t grow large with duplicated data. Normalization permits us to design our relational database tables so that they “(1) contain all the data necessary for the purposes that the database is to serve, (2) have as little redundancy as possible, (3) accommodate multiple values for types of data that require them, (4) permit efficient updates of the data in the database, and (5) avoid the danger of losing data unknowingly (Wyllys, R. E., 2002).”
The resulting normalized database is highly efficient, which can be characterized by –
Increased Consistency: Information is stored in one place and one place only, reducing the possibility of inconsistent data.
Easier object-to-data mapping: Highly-normalized data schemas in general are closer conceptually to object-oriented schemas because the object-oriented goals of promoting high cohesion and loose coupling between classes results in similar solutions.
Moreover, a normalized database is advantageous when operations will be write-intensive or when ACID (Atomicity, Consistency, Isolation, Durability) compliance is required. Some advantages include:
Updates run quickly since no data being duplicated in multiple locations.
Inserts run quickly since there is only a single insertion point for a piece of data and no duplication is required.
Tables are typically smaller than the tables found in non-normalized databases. This usually allows the tables to fit into the buffer, thus offering faster performance.
Data integrity and consistency is an absolute must if the database must be ACID compliant. A normalized database helps immensely with such an undertaking.
Searching, sorting, and creating indexes can be faster, since tables are narrower, and more rows fit on a data page.
Minimizes/avoids data modification issues.
(https://en.wikipedia.org/wiki/ACID_(computer_science))
Summary
The paper defined the concept of functional dependency, which is the basic tool for analyzing relational schemas, and discussed some of its properties. Functional dependencies specify semantic constraints among the attributes of a relation schema. Next it described the normalization process for achieving good designs It presented examples to illustrate how by using the general definition of the normal forms, a given relation may be analyzed and decomposed to eventually yield a set of relations in 3NF. The paper also touches not often used BCNF, 4NF and 5NF normal forms.
Then the paper explains how functional dependencies and normalization are inter-related in the design of a relational model database. It explains the importance of functional dependency and normalization in the design of a relational database. A normalized database is highly efficient and has many advantages.
References
Wyllys, R. E., 2002. Database management principles and applications
Elmasri, Ramez and Shamkant B. Navathe. 2006. Fundamentals of Database Systems. 5th ed. Reading, MA: Addison-Wesley
Russell, Gordon. Chapter 4 – Normalization. Database eLearning
Nguyen Kim Anh, Relational Design Theory. OpenStax CNX
Gaikwad, A.S., Kadri, F.A., Khandagle, S.S., Tava, N.I. (2017) Review on Automation Tool for ERD Normalization. International Research Journal of Engineering and Technology (IRJET) [Online]. 4 (2), pp. 1323-1325. [Accessed 07 May 2017]. Available from: https://www.irjet.net/archives/V4/i2/IRJET-V4I2259.pdf
https://en.wikipedia.org/wiki/ACID_(computer_science)
Tables
Un-normalized Table: sales
sales
date
order_no
cust_no
product_description
price
quantity_sold
cust_name
cust_address
12/12/2018
1001
A320
MP3
10.00
8
Tom
1 Main St, Hartford, CT 06106
12/12/2015
1001
B101
Ipod
100.00
4
Tom
1 Main St, Hartford, CT 06106
01/05/2019
1002
C101
Blu Ray
80.00
3
Aaron
1 Holy Lane, Manchester, 06040
1NF Table: sales
sales
date
order_no
product_no
product_description
price
quantity_sold
cust_name
….
12/12/2018
1001
A320
MP3
10.00
8
Tom
12/12/2015
1001
B101
Ipod
100.00
4
Tom
01/05/2019
1002
C101
Blu Ray
80.00
3
Aaron
Continued..
sales
….
cust_street
cust_city
cust_state
cust_zip
1 Main St
Hartford
CT
06106
1 Main St
Hartford
CT
06106
1 Holy Lane
Manchester
CT
06040
2NF Table: order
order
date
order_no
cust_no
12/12/2018
1001
101
01/05/2019
1002
102
2NF Table: product
product
product_no
product_description
price
A320
MP3
10.00
B101
Ipod
100.00
C101
Blu Ray
80.00
2NF Table: order_detail
order_detail
order_no
product_no
quantity_sold
1001
A320
8
1001
B101
4
1002
C101
3
2NF Table: customer
customer
cust_no
cust_name
cust_street
cust_city
cust_state
cust_zip
101
Tom
1 Main Street
Hartford
CT
06106
102
Aaron
1 Holy Lane
Manchester
CT
06040
3NF Table: customer
customer
cust_no
cust_name
cust_street
zip_code
101
Tom
1 Main Street
06106
102
Aaron
1 Holy Lane
06040
3NF Table: zip
zip
zip_code
city
State
06106
Hartford
CT
06040
Manchester
CT
Essay Writing Service Features
Our Experience
No matter how complex your assignment is, we can find the right professional for your specific task. Contact Essay is an essay writing company that hires only the smartest minds to help you with your projects. Our expertise allows us to provide students with high-quality academic writing, editing & proofreading services.Free Features
Free revision policy
$10Free bibliography & reference
$8Free title page
$8Free formatting
$8How Our Essay Writing Service Works
First, you will need to complete an order form. It's not difficult but, in case there is anything you find not to be clear, you may always call us so that we can guide you through it. On the order form, you will need to include some basic information concerning your order: subject, topic, number of pages, etc. We also encourage our clients to upload any relevant information or sources that will help.
Complete the order formOnce we have all the information and instructions that we need, we select the most suitable writer for your assignment. While everything seems to be clear, the writer, who has complete knowledge of the subject, may need clarification from you. It is at that point that you would receive a call or email from us.
Writer’s assignmentAs soon as the writer has finished, it will be delivered both to the website and to your email address so that you will not miss it. If your deadline is close at hand, we will place a call to you to make sure that you receive the paper on time.
Completing the order and download