Introduction
A Surrogate Key in SQL Server is a unique identifier for each row in the table. It is just a key. Using this key we can identify a unique row. There is no business meaning for Surrogate Keys. This type of key is either database generated or generated via another application (not supplied by user).
A Surrogate Key is just a unique identifier for each row and it may use as a Primary Key. There is only requirement for a surrogate Primary Key, which is that each row must have a unique value for that column. A Surrogate Key is also known as an artificial key or identity key. It can be used in data warehouses.
A Surrogate Key should have the following characteristics:
- Unique Value
- The key is generated by the system, in other words automatically generated
- The key is not visible to the user (not a part of the application)
- It is not composed of multiple keys
- There is no semantic meaning of the key
Generally, a Surrogate Key is a sequential unique number generated by SQL Server or the database itself. The purpose of a Surrogate Key is to act as the Primary Key. There is a slight difference between a Surrogate Key and a Primary Key. Ideally, every row has both a Primary Key and a Surrogate Key. The Primary Key identifies the unique row in the database while the Surrogate Key identifies a unique entity in the model.
Note that Surrogate Keys are never used with any business logic other than simple Create, Read, Update and Delete (CRUD) operations.
Example of Surrogate Key
- Identity Column in SQL Server
- GUID (Globally Unique Identifier)
- UUID (Universally Unique Identifier)
How can we implement Surrogate Key?
There are several ways to implement Surrogate Keys as in the following:
- Globally Unique Identifiers (GUID)
GUID is a Microsoft standard that extends Universally Unique Identifier (UUID). Using a NEWID() function we can generate a new GUID in SQL Server. It is a 16 byte GUID.
-
- DECLARE @newID UNIQUEIDENTIFIER
- SET @newID = NEWID()
NEWSEQUENTIALID() can be used with DEFAULT constraints on the table column of type uniqueidentifier. We cannot use the NEWSEQUENTIALID() function as a reference in queries.
- Universally unique identifier (UUID)
UUID is 128 bit values that are created from hash of the ID of Ethernet card and current data time of SQL Server.
Advantages of Surrogate Key
- A Surrogate Key does not change so the application cannot lose their reference row in the database.
- If the Primary Key is changed then the related foreign key does not change across the database because the Surrogate Key is used as a reference key. In other words, the Surrogate Key value is never changed, so the foreign key values become stable.
- A Surrogate Key is most often a compact data type such as an integer. A Surrogate Key is less expensive in a "Join" than the compound key.
- Business logic does not something in this key.
- A table always has a uniform Surrogate Key, so some tasks can be easily automated by writing the code table independently.
- There is no locking contention because it is a unique identifier.
- A Surrogate Key does not require an extra field; that helps to save space in the database.
- The relationship between any two tables is simple and consistent in SQL code expressions.
- Object Relational Mapping (ORM) frameworks such as Entity Framework, N-Hibernate, and so on are designed to work optimally with Surrogate Keys. It is very simple to implement them over the composite keys.
- It allows for a higher degree of normalization, so data is not duplicated within the database.
Disadvantages of Surrogate Key
- Additional index is required.
- It cannot be used as a search key because it is not related to any business logic or it is independent of any business logic.
- There is always a requirement to join to the main table when data is selected from a child table.
- It increases the sequential number by a random amount.
- There is some administrative overhand to maintain a Surrogate Key.
- Extra disk space required to store a Surrogate Key.
How Surrogate Key differs from the Natural Key?
The alternative to a Surrogate Key is Natural Keys. A Natural Key is a true unique identifier in the database. It is a single value or composite value that has business meaning. The Natural Key can be one or more columns with any data type. If there is no Surrogate Key on table then there is no need to create a unique index or sequence on a database table, so it helps us to reduce administrative overhead.
Disadvantages of Natural Key
- A query join may become complex because the Natural Key can have one or more columns.
- It is a reduced normalization form.
- It is very difficult to use and time consuming with ORM because ORM is designed to work best with Surrogate Keys.
- The key type is not consistent.
- More work is required to change a Natural Key when the foreign key relationship has been built by a Natural Key.
- A Natural Key is larger than a Surrogate Key.
- A Natural Key can be any data type, so it might require a long execution time in a "join" query. For example, if there is a VARCHAR data type as a Natural Key type then the join between two tables may take more time to produce output.
- A Natural Key is assigned by the application, so there is no way to know whether a record is new or an existing record.
Conclusion
A Surrogate Key is unique in the database table; it is just like an artificial or alternative key to a Primary Key because a Primary Key may be alphanumeric or a composite key. A Surrogate Key is always unique per table.
Surrogate Keys offer many benefits. Simplicity, consistency and stability, makes the use of an ORM extremely feasible. We can use a Natural Key instead of A Surrogate Key when A Natural Key is small and this key is never updated.