Introduction To Databases

Introduction

 
Before we start with SQL, we need to understand databases. This chapter introduces databases, their different types, and other related terms.
 

What are Databases?

 
A database is an organized collection of data, generally stored and accessed electronically from a computer system. It supports the storage and manipulation of data.
 
In other words, databases are used by an organization as a method of storing, managing and retrieving information.
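
To make "storing, managing and retrieving" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `contacts`, its columns, and the sample values are assumptions made up for illustration; the database lives in memory only.

```python
import sqlite3

# In-memory database: nothing is written to disk (chosen here for brevity).
conn = sqlite3.connect(":memory:")

# Storing: define a place for the data and put some records into it.
conn.execute("CREATE TABLE contacts (name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO contacts (name, phone) VALUES (?, ?)",
    [("Asha", "555-0101"), ("Ben", "555-0102")],
)

# Managing: change a value that is already stored.
conn.execute("UPDATE contacts SET phone = ? WHERE name = ?", ("555-0199", "Ben"))

# Retrieving: read the data back.
rows = conn.execute("SELECT name, phone FROM contacts ORDER BY name").fetchall()
print(rows)  # [('Asha', '555-0101'), ('Ben', '555-0199')]
conn.close()
```

The SQL statements used here are covered in later chapters; for now, the point is only that the same system handles storage, changes, and lookups.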
 
 

Types of Databases

 
Depending on the usage requirements, the following types of databases are available in the market:
  • Centralized database
  • Distributed database
  • Personal database
  • End-user database
  • Commercial database
  • NoSQL database
  • Operational database
  • Relational database
  • Cloud database
  • Object-oriented database
  • Graph database
 

Advantages of using Databases

 
Databases offer many advantages:
  • Reduced data redundancy
  • Reduced updating errors and increased consistency
  • Greater data integrity and independence from application programs
  • Improved data access to users through the use of host and query languages
  • Improved data security
  • Reduced data entry, storage, and retrieval costs

Disadvantages of using Databases

 
Although databases allow businesses to store and access data efficiently, they also have certain disadvantages:
  • Complexity
  • Cost
  • Security
  • Compatibility

Some examples of Databases

 
Some of the most popular databases are 
  1. Oracle Database
  2. Sybase
  3. MySQL

History of Databases

 
The first type of DBMS to emerge was the hierarchical DBMS. IBM had the first model, developed on the IBM 360; its DBMS was called IMS and was originally written for the Apollo program. This type of DBMS was based on a tree structure, in which relations were limited to those between parent and child records. The benefits were numerous: less redundant data, data independence, security, and integrity, all of which led to efficient searches. Nonetheless, there were disadvantages, such as a complex implementation that was hard to manage because of the absence of standards, and difficulty handling many relationships.
 

Early History of Databases

 
Before databases existed, everything had to be recorded on paper. We had lists, journals, ledgers, and endless archives containing hundreds of thousands or even millions of records contained in filing cabinets. When it was necessary to access one of these records, finding and physically obtaining the record was a slow and laborious task. There were often problems ranging from misplaced records to fires that wiped out entire archives and destroyed the history of societies, organizations, and governments. There were also security problems because physical access was often easy to gain.
 
The database was created to try and solve these limitations of traditional paper-based information storage. In databases, the files are called records, and the individual data elements in a record (for example, name, phone number, date of birth) are called fields. The way these elements are stored has evolved since the early days of databases.
The earliest systems were called the hierarchical and network models.
 
The hierarchical model, which IBM developed in the 1960s, organized data in a tree-like structure, as shown in the figure.
 
[Figure: history of databases]
  • 1960s: Two models were in use when computerized DBMSs first appeared, around the time computers became a practical choice for private organizations: the CODASYL network model and IMS, IBM's hierarchical model. The SABRE system, built by IBM, was designed to help American Airlines manage reservations data.
  • 1964: IBM announced its System/360 family of mainframe machines.
  • 1970-72: E.F. Codd published a landmark paper proposing the relational model, in which the database schema is separated from the physical storage of the information. This became the gold standard for DBMSs.
  • 1970s: Two major RDBMS prototypes were built: INGRES at UC Berkeley and System R at IBM San Jose. INGRES used the query language QUEL and ultimately led to Ingres Corp., MS SQL Server, Sybase, Wang's PACE, and Britton-Lee. System R used the query language SEQUEL and led to the development of SQL/DS, DB2, Allbase, Oracle, and NonStop SQL.
  • 1976: P. Chen proposed entity-relationship diagrams (ERDs). These are also known as conceptual models, and they focus on how data is used in the application rather than on the logical table structure.
  • 1980s: SQL became the standard query language. RDBMSs were widely popular, and DB2 became a flagship IBM product. Later, the introduction of the IBM PC led to several new database companies and products, such as RBASE 5000, RIM, and PARADOX.
  • Early 1990s: After an industry shakeout, the surviving companies sold their products at high prices. Meanwhile, new client tools for developing applications were released, including Oracle Developer, VB, and PowerBuilder. ODBC prototypes and Excel/Access were also developed in the same period.
  • Mid-1990s: The Internet became popular, which caused exponential growth in the database industry. More people with average desktops started using client/server systems to access legacy data.
  • Late 1990s: Investment in online business led to Internet database connectors such as FrontPage, Active Server Pages, Java Servlets, Dreamweaver, Oracle Developer 2000, and Enterprise JavaBeans. The use of Apache, MySQL, and other systems brought open-source solutions to the Internet. Gradually, online transaction processing and analytic processing became popular.
  • 2000s: Database applications were largely unaffected by the decline of the Internet industry. Interactive applications for PDAs and point-of-sale transactions were developed, and vendors consolidated.
  • Present: Microsoft, Oracle, and IBM are currently the leading database system companies.

Database Management System

 
A database management system (DBMS) is a software package designed to define, manipulate, retrieve, and manage data in a database. A DBMS generally manipulates the data itself, the data format, field names, record structure, and file structure. It also defines rules to validate and manipulate this data.
 
In other words, a database management system is a combination of hardware and software that can be used to set up and monitor a database and to manage the updating and retrieval of the data stored in it.
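
The four jobs named above (define, manipulate, retrieve, and validate) can be sketched with SQLite, a small DBMS that ships with Python. The `employees` table and the rule that `age` must be at least 18 are assumptions invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define: the DBMS records the structure and the validation rules.
conn.execute(
    """
    CREATE TABLE employees (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        age  INTEGER CHECK (age >= 18)
    )
    """
)

# Manipulate: a valid row is accepted.
conn.execute("INSERT INTO employees (name, age) VALUES (?, ?)", ("Asha", 30))

# Validate: a row that breaks the CHECK rule is rejected by the DBMS itself.
try:
    conn.execute("INSERT INTO employees (name, age) VALUES (?, ?)", ("Ben", 12))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Retrieve: only the valid row was stored.
names = [row[0] for row in conn.execute("SELECT name FROM employees")]
print(rejected, names)  # True ['Asha']
conn.close()
```

Note that the application never checks the age itself; the rule lives in the database definition, so every program that uses the database gets the same validation.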
 

Types of DBMS

 
There are 4 major types of DBMS. Let's look into them in detail.
  • Hierarchical DBMS - this type of DBMS stores data using a "parent-child" relationship. It is rarely used nowadays. Its structure is like a tree, with nodes representing records and branches representing fields. The Windows Registry used in Windows XP is an example of a hierarchical database: configuration settings are stored as tree structures with nodes.
  • Network DBMS - this type of DBMS supports many-to-many relations. This usually results in complex database structures. RDM Server is an example of a database management system that implements the network model.
  • Relational DBMS - this type of DBMS defines database relationships in the form of tables, also known as relations. Unlike a network DBMS, an RDBMS does not link records in a many-to-many relationship directly; such a relationship is modeled with an additional linking table. Relational DBMSs usually have pre-defined data types that they can support. This is the most popular DBMS type in the market. Examples of relational database management systems include MySQL, Oracle, and Microsoft SQL Server.
  • Object-Oriented DBMS - this type supports the storage of new data types. The data to be stored is in the form of objects. The objects stored in the database have attributes (e.g., gender, age) and methods that define what to do with the data. PostgreSQL is an example of an object-relational DBMS.

Hierarchical DBMS

 
A hierarchical model represents data in a tree-like structure in which each record has a single parent. To maintain order, a sort field keeps sibling nodes in a recorded order. Models of this type were designed primarily for early mainframe database management systems, such as the Information Management System (IMS) by IBM.
 
This model allows one-to-one and one-to-many relationships between two or more types of data. The structure is helpful for describing many real-world relationships, such as tables of contents or any nested and sorted information.
 
The hierarchical structure is used as the physical order of records in storage. One can access the records by navigating down through the data structure using pointers which are combined with sequential access. Therefore, the hierarchical structure is not suitable for certain database operations when a full path is not also included for each record.
 
Data in this type of database is structured hierarchically and is typically developed as an inverted tree. The "root" in the structure is a single table in the database and other tables act as the branches flowing from the root. The diagram below shows a typical hierarchical database structure.
 
[Figure: hierarchical database structure]
A relationship in this database model is represented by the term parent/child. A parent table can be linked with one or more child tables in this type of relationship, but a single child table can be linked with only one parent table. The tables are explicitly linked via a pointer/index or by the physical arrangement of the records within the tables.
 
A user can access the data by starting at the root table and working down through the tree to the target data. The user must be familiar with the structure of the database to access the data without difficulty.
 
The IBM Information Management System (IMS) and RDM Mobile are examples of hierarchical database systems with multiple hierarchies over the same data.
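
The root-to-target navigation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how IMS or any real hierarchical DBMS is implemented; the `Node` class and the sample names (`Company`, `Sales`, `Asha`) are assumptions.

```python
class Node:
    """One record in the hierarchy; every node except the root has one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # a single parent, as the model requires
        self.children = []        # any number of children

    def add_child(self, name):
        child = Node(name, parent=self)
        self.children.append(child)
        return child

    def find(self, path):
        """Walk down from this node, following one child name per step."""
        node = self
        for name in path:
            node = next((c for c in node.children if c.name == name), None)
            if node is None:
                return None       # the path does not exist in the tree
        return node


root = Node("Company")
sales = root.add_child("Sales")
sales.add_child("Asha")

found = root.find(["Sales", "Asha"])
missing = root.find(["Asha"])     # wrong path: Asha is not directly under the root
print(found.name, found.parent.name, missing)  # Asha Sales None
```

The second lookup fails even though the record exists, which shows why the user has to know the structure of the database: data is reachable only through the correct path from the root.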
 

Advantages

 
Let's discuss some advantages of Hierarchical DBMS
  1. A user can retrieve data very quickly due to the presence of explicit links between the table structures.
  2. Referential integrity is built in and automatically enforced: a record in a child table must be linked to an existing record in a parent table, and deleting a record in the parent table also deletes all associated records in the child table.
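
The second advantage can be demonstrated with a small sketch. Python does not ship with a hierarchical DBMS, so this example borrows SQLite (a relational engine) and its foreign keys to imitate the same two rules; the `department` and `employee` tables are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """
    CREATE TABLE employee (
        id            INTEGER PRIMARY KEY,
        name          TEXT,
        department_id INTEGER NOT NULL
            REFERENCES department(id) ON DELETE CASCADE
    )
    """
)

conn.execute("INSERT INTO department (id, name) VALUES (1, 'Sales')")
conn.execute("INSERT INTO employee (name, department_id) VALUES ('Asha', 1)")

# Rule 1: a child that points at a missing parent is rejected.
try:
    conn.execute("INSERT INTO employee (name, department_id) VALUES ('Ben', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Rule 2: deleting the parent removes its children as well.
conn.execute("DELETE FROM department WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(orphan_rejected, remaining)  # True 0
conn.close()
```

In a hierarchical DBMS these rules follow from the tree structure itself, whereas here they had to be declared explicitly.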

Disadvantages

 
Let's discuss some disadvantages of Hierarchical DBMS 
  1. A user cannot easily store a record in a child table that is currently unrelated to any record in a parent table; the user must first record an additional entry in the parent table.
     
  2. This type of database cannot support complex relationships, and there is also a problem of redundancy, which can result in producing inaccurate information due to the inconsistent recording of data at various sites.

Network DBMS

 
The network model is the extension of the hierarchical structure because it allows many-to-many relationships to be managed in a tree-like structure that allows multiple parents.
 
In other words, network databases resemble hierarchical databases, but whereas a node in a hierarchical database can have only one parent, a node in a network database can have relationships with multiple entities. A network database looks more like a cobweb or an interconnected network of records.
 
In network databases, children are called members and parents are called owners. The difference is that each child, or member, can have more than one parent.
 
There are two fundamental concepts of a network model:
  1. Records contain fields that need hierarchical organization.
  2. Sets are used to define one-to-many relationships between records; each set contains one owner and many members.
A record may act like an owner in any number of sets, and a member in any number of sets. 
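
The owner/member idea can be sketched as follows. This is a toy model under stated assumptions, not the API of IDS, IDMS, or any real network DBMS; the class `NetworkDB`, its method names, and the set names `works_in` and `assigned_to` are invented for this example.

```python
class NetworkDB:
    """Toy network model: each set has one owner record and many member records."""

    def __init__(self):
        # (set name, owner) -> list of member records
        self.sets = {}

    def connect(self, set_name, owner, member):
        self.sets.setdefault((set_name, owner), []).append(member)

    def members(self, set_name, owner):
        return list(self.sets.get((set_name, owner), []))

    def owners(self, member):
        """All owners ("parents") of a record, across every set it belongs to."""
        return sorted(
            owner
            for (_, owner), members in self.sets.items()
            if member in members
        )


db = NetworkDB()
db.connect("works_in", "Sales", "Asha")
db.connect("works_in", "Sales", "Ben")
db.connect("assigned_to", "Apollo", "Asha")   # Asha gets a second owner

print(db.members("works_in", "Sales"))  # ['Asha', 'Ben']
print(db.owners("Asha"))                # ['Apollo', 'Sales']
```

Each set is still a one-to-many relationship, but because a record can be a member of several sets, it ends up with several parents, which a pure tree cannot express.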
 
Some well-known database systems that use the network model include:
  • Integrated Data Store (IDS)
  • IDMS (Integrated Database Management System)
 
 

History of Network Model

 
The network database structure was invented by Charles Bachman. Some of the popular network databases are the Integrated Data Store (IDS), IDMS (Integrated Database Management System), Raima Database Manager, TurboIMAGE, and Univac DMS-1100.
 

Advantages

 
Let's discuss some advantages of Network DBMS 
  1. Fast data access
  2. It also allows users to create queries that are more complex than those they created using a hierarchical database. So, a variety of queries can be run over this model.

Disadvantages

 
Let's discuss some disadvantages of Network DBMS 
  1. A user must be very familiar with the structure of the database to work through the set structures.
  2. Updating inside this database is a tedious task. One cannot change a set structure without affecting the application programs that use this structure to navigate through the data. If you change a set structure, you must also modify all references made from within the application program to that structure.

Relational DBMS

 
RDBMS stands for Relational Database Management Systems.
 
Many modern database management systems, such as MS SQL Server, IBM DB2, Oracle, MySQL, and Microsoft Access, are relational, and SQL is the language used to work with them.
 
It is called Relational Database Management System (RDBMS) because it is based on the relational model introduced by E.F. Codd.
 
Standard relational databases enable users to manage predefined data relationships across multiple databases.
 
Popular examples of relational databases include Microsoft SQL Server, Oracle Database, MySQL, and IBM DB2.
Relational databases work on the principle that each table has a key field that uniquely identifies each row, and that these key fields can be used to connect one table of data to another.
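
Here is a minimal sketch of key fields connecting two tables, again using SQLite from Python. The `customers` and `orders` tables and their contents are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each table has a key field that uniquely identifies every row.
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customers(customer_id), "
    "item TEXT)"
)

conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Asha"), (2, "Ben")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, "Keyboard"), (11, 1, "Mouse"), (12, 2, "Monitor")],
)

# The shared customer_id value connects a row in orders to a row in customers.
rows = conn.execute(
    "SELECT c.name, o.item "
    "FROM customers AS c "
    "JOIN orders AS o ON o.customer_id = c.customer_id "
    "ORDER BY o.order_id"
).fetchall()
print(rows)  # [('Asha', 'Keyboard'), ('Asha', 'Mouse'), ('Ben', 'Monitor')]
conn.close()
```

The customer's name is stored only once, in `customers`; every order refers to it through the key instead of repeating it.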
 

Relational databases are popular for two major reasons

  1. Relational databases can be used with little or no training.
  2. Database entries can be modified without specifying the entire body.

Advantages

 
Let's discuss some advantages of Relational DBMS 
  • Simplicity
  • Ease of Data Retrieval
  • Data Integrity
  • Flexibility

Disadvantages

 
Let's discuss some disadvantages of Relational DBMS 
  • They tend to be slow and do not scale easily: adding more servers does not always let you do more work.
  • They have a fixed schema which is a plus unless this hurts productivity too much.
  • Tables don’t always map to objects in applications very well.
  • They are not secure enough to expose to the internet and need a layer to be added to protect them.
  • They are not good at modeling certain kinds of data such as graphs and geo-spatial queries.
  • They are not good at storing very large records.

Object-Oriented Model

 
An object-oriented database management system (OODBMS), also called an object database management system (ODMS), is a database management system that supports the creation and modeling of data as objects. An OODBMS also includes support for classes of objects and the inheritance of class properties, and it incorporates methods, subclasses, and objects. Most object databases also offer some kind of query language, permitting objects to be found through a declarative programming approach.
 
In other words, an object-oriented database management system represents information in the form of objects, as used in object-oriented programming. An OODBMS allows object-oriented programmers to develop products, store them as objects, and replicate or modify existing objects to produce new ones within the OODBMS. It also lets programmers enjoy the consistency that comes with a single programming environment, because the database is integrated with the programming language and uses the same representation model.
 
 
Object-oriented databases use small, reusable chunks of software called objects. The objects themselves are stored in the object-oriented database. Each object contains two elements:
  1. Piece of data (e.g., sound, video, text, or graphics)
  2. Instructions, or software programs called methods, for what to do with the data
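
The two elements of an object, data and methods, can be sketched in Python. The classes `Person` and `Employee` are assumptions for illustration, and the plain dictionary `store` is only a stand-in for an object database, not a real OODBMS.

```python
class Person:
    def __init__(self, name, age):
        # Element 1: the data (attributes).
        self.name = name
        self.age = age

    # Element 2: a method that says what to do with the data.
    def greeting(self):
        return f"Hello, {self.name}"


class Employee(Person):
    """A subclass inherits the attributes and methods of Person."""

    def __init__(self, name, age, salary):
        super().__init__(name, age)
        self.salary = salary

    def give_raise(self, percent):
        # Integer arithmetic keeps the result exact for whole-number inputs.
        self.salary = self.salary * (100 + percent) // 100


# Stand-in for an object store: the whole object is kept, not rows and columns.
store = {}
store["emp:1"] = Employee("Asha", 30, 50000)

loaded = store["emp:1"]
loaded.give_raise(10)
print(loaded.greeting(), loaded.salary)  # Hello, Asha 55000
```

What comes back from the store is an object with its methods intact, so the program works with the same representation in memory and in storage.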
Object-oriented database management systems (OODBMSs) were created in the early 1980s. Some OODBMSs were designed to work with OOP languages such as Delphi, Ruby, C++, Java, and Python.
 
Some popular examples of OODBMSs are TORNADO, Gemstone, ObjectStore, GBase, VBase, InterSystems Cache, Versant Object Database, ODABA, ZODB, Poet, JADE, and Informix.
 

Advantages of Object-Oriented databases

 
Let's discuss some advantages of Object-Oriented DBMS 
  1. Enriched modeling capabilities
  2. Extensibility
  3. Capable of handling a large variety of data types
  4. Support for schema evolution
  5. Improved performance

Disadvantages of Object-Oriented databases

 
Let's discuss some disadvantages of Object-Oriented DBMS 
  1. An object-oriented database is more expensive to develop
  2. Most organizations are unwilling to abandon the databases they already use and convert to an object-oriented one

Summary

 
In this chapter, we studied databases, the types of databases, the advantages and disadvantages of each type, and the history and timeline of databases.
 
In the next chapter, we will start with SQL.
Author
Onkar Sharma