IS [NOT] DISTINCT FROM New T-SQL Enhancements in SQL Server

Overview

SQL Server 2022, the latest release of Microsoft’s popular relational database management system, has introduced several new features and enhancements that are aimed at improving developer productivity, query performance, and database management. One such feature is the addition of the IS DISTINCT FROM and IS NOT DISTINCT FROM predicates, which offer a more intuitive and comprehensive way to handle comparisons between nullable columns in SQL queries.

For SQL developers, data engineers, and database administrators (DBAs), dealing with NULL values in SQL can be tricky, especially when it comes to comparisons. The traditional equality operator (=) and its negation (<>) often fail to handle NULL values effectively, leading to confusion and unintended results. With the introduction of IS DISTINCT FROM and IS NOT DISTINCT FROM, SQL Server 2022 provides a more reliable and expressive way to compare nullable columns and handle NULL values.

The Problem with NULL Comparisons in SQL

It's important to understand the challenges that arise when dealing with NULL values in SQL. In SQL, NULL represents an unknown or missing value, and it behaves differently than other values in the database.

The standard equality (=) and inequality (<>) operators do not behave as expected when NULL values are involved.

  • When comparing two NULL values using =, the result is not TRUE, but rather UNKNOWN, because SQL treats NULL as an unknown value.
  • Similarly, comparing a NULL value with a non-NULL value using <> results in UNKNOWN, not TRUE.

Consider the following example. This behavior stems from the fact that NULL represents an unknown value, and comparing two unknown values logically does not lead to a definitive TRUE or FALSE result. Instead, SQL returns UNKNOWN, which is treated as FALSE in most boolean contexts.

SQL returns UNKNOWN

Traditional Solutions: IS NULL and IS NOT NULL

To work around this issue, SQL developers typically use the IS NULL and IS NOT NULL checks when comparing nullable columns. While this approach works, it can be verbose and error-prone, especially in complex queries involving multiple nullable columns.

SQL developers

The New IS DISTINCT FROM and IS NOT DISTINCT FROM Predicates

SQL Server 2022 addresses the challenge of NULL comparisons by introducing the IS DISTINCT FROM and IS NOT DISTINCT FROM predicates, which offer a more straightforward and intuitive way to compare nullable columns.

IS DISTINCT FROM: The IS DISTINCT FROM predicate compares two values and returns TRUE if the values are different, even when one or both of the values are NULL. In other words, IS DISTINCT FROM treats NULL as a regular value, and two NULL values are considered equal.

Syntax

value1 IS DISTINCT FROM value2

--TRUE if the two values are different, including cases where one value is NULL and the other is not.
--FALSE if the two values are the same, including cases where both values are NULL.

IS NOT DISTINCT FROM: The IS NOT DISTINCT FROM predicate is the logical opposite of IS DISTINCT FROM. It returns TRUE if the two values are the same, even when both values are NULL.

Syntax

value1 IS NOT DISTINCT FROM value2

--TRUE if the two values are the same, including cases where both values are NULL.
--FALSE if the two values are different, including cases where one value is NULL and the other is not.

Example Use Case

Consider the following example, where we have a table Employees with nullable columns Salary and Bonus. We want to compare the Salary and Bonus columns and determine whether they are distinct or not, even when NULL values are involved.

NULL values

To compare the Salary and Bonus columns using traditional SQL operators, we would need to write a verbose query.

SELECT
    EmployeeID,
    CASE 
        WHEN Salary IS NULL AND Bonus IS NULL THEN 'Equal'
        WHEN Salary = Bonus THEN 'Equal'
        ELSE 'Distinct'
    END AS ComparisonResult
FROM dbo.Employees

Using IS DISTINCT FROM, the query becomes much simpler.

SELECT
    EmployeeID,
    CASE 
        WHEN Salary IS DISTINCT FROM Bonus THEN 'Distinct'
        ELSE 'Equal'
    END AS ComparisonResult
FROM dbo.Employees

Output

Output

Advantages

  1. Simplified Code: As demonstrated in the previous example, the IS DISTINCT FROM and IS NOT DISTINCT FROM predicates allow for more concise and readable SQL queries. By eliminating the need for IS NULL and IS NOT NULL checks, these predicates reduce the verbosity of the code and make it easier to understand.
  2. More Intuitive Handling of NULL Values: These predicates provide a more intuitive way to compare nullable columns, as they treat NULL values as regular values. This is especially useful in scenarios where developers need to compare nullable columns without worrying about the special handling of NULL.
  3. Improved Consistency: The traditional behavior of SQL's equality (=) and inequality (<>) operators can be confusing when NULL values are involved. By using IS DISTINCT FROM and IS NOT DISTINCT FROM, developers can ensure consistent behavior in their queries, regardless of whether NULL values are present.

Use Cases

1. Data Deduplication

In data deduplication tasks, developers often need to identify and remove duplicate rows from a table. When nullable columns are involved, the IS DISTINCT FROM predicate can help ensure that NULL values are treated consistently.

For example, to identify duplicate rows in a table with nullable columns.

WITH DuplicateRows AS (
    SELECT
        EmployeeID,
        ROW_NUMBER() OVER (PARTITION BY Salary, Bonus ORDER BY EmployeeID) AS RowNum
    FROM dbo.Employees
    WHERE Salary IS NOT DISTINCT FROM Bonus
)
DELETE FROM DuplicateRows WHERE RowNum > 1

2. Conditional Updates and Inserts

In scenarios where developers need to perform conditional updates or inserts based on comparisons of nullable columns, IS DISTINCT FROM and IS NOT DISTINCT FROM can simplify the logic and ensure accurate results.

For example, to update the Bonus column only if it is distinct from the Salary column.

UPDATE dbo.Employees
SET Bonus = Salary * 0.1
WHERE Salary IS DISTINCT FROM Bonus;

3. Data Comparison and Synchronization

In data comparison and synchronization tasks, developers often need to compare rows in two different tables to identify changes or discrepancies. When nullable columns are involved, the traditional comparison operators may not behave as expected, leading to incorrect results.

For example, consider two tables, dbo.Employees and dbo.Employees1, which stores employee salary information. We want to compare the two tables and identify rows where the salary has changed. In this query, IS DISTINCT FROM ensures that the comparison works correctly, even when one or both Salary values are NULL.

SELECT
    a.EmployeeID,
    a.Salary AS SalaryInTableA,
    b.Salary AS SalaryInTableB
FROM
    dbo.Employees a
JOIN
    dbo.Employees1 b
    ON a.EmployeeID = b.EmployeeID
WHERE
    a.Salary IS DISTINCT FROM b.Salary;

Conclusion

The introduction of the IS DISTINCT FROM and IS NOT DISTINCT FROM predicates in SQL Server 2022 is a welcome enhancement for SQL developers, data engineers, and DBAs. These new predicates simplify the process of comparing nullable columns, reduce the complexity of SQL queries, and provide more intuitive and consistent


Similar Articles