Introduction
When working with temporary tables in SQL Server, developers often face a common dilemma: Should they use INSERT INTO or SELECT INTO? Both are widely used approaches for populating tables, but their effectiveness depends on the use case, particularly when it comes to performance, metadata caching, and concurrency. In this article, we’ll explore the differences between these two methods, their impact on temporary tables, and practical recommendations for SQL developers, data engineers, and DBAs. Using examples from the AdventureWorks2022 database, we’ll make the concepts clear and actionable.
Understanding INSERT INTO
The INSERT INTO statement is used to insert data into an existing table. This approach requires you to create the table structure in advance, allowing you to define constraints, indexes, and keys inline with the table definition.
Syntax
INSERT INTO TableName (Column1, Column2, ...)
SELECT Column1, Column2, ...
FROM SourceTable
This approach gives you control over the table's schema and ensures that all necessary metadata, such as constraints or indexes, is pre-defined.
Example
USE AdventureWorks2022
GO
-- Step 1: Create the temp table
CREATE TABLE #ProductInfo (
ProductID INT PRIMARY KEY,
Name NVARCHAR(100),
ProductNumber NVARCHAR(50)
)
-- Step 2: Insert data into the temp table
INSERT INTO #ProductInfo (ProductID, Name, ProductNumber)
SELECT ProductID, Name, ProductNumber
FROM Production.Product
-- Verify the data
SELECT * FROM #ProductInfo
Output
Understanding SELECT INTO
The SELECT INTO statement creates a new table on the fly and inserts data into it at the same time. Unlike INSERT INTO, the table does not need to be created beforehand.
Syntax
SELECT Column1, Column2, ...
INTO NewTableName
FROM SourceTable
This method is quicker when working with ad-hoc queries since you don’t have to define the table structure beforehand. However, it comes with limitations, especially for temporary tables in performance-critical environments.
Example
USE AdventureWorks2022
GO
-- Create and populate a temp table using SELECT INTO
SELECT ProductID, Name, ProductNumber
INTO #ProductInfo
FROM Production.Product
-- Verify the data
SELECT * FROM #ProductInfo
Output
INSERT INTO vs SELECT INTO
Differences |
INSERT INTO |
SELECT INTO |
Table Creation |
Requires the table to be created beforehand. This gives you the flexibility to define indexes, keys, and constraints during creation. |
Creates the table automatically based on the SELECT query. However, it doesn’t allow you to define indexes or constraints inline. |
Indexing and Constraints |
Supports indexes and constraints, which can improve query performance. |
Requires you to add indexes or constraints after the table is created, which can be less efficient. |
Caching and Metadata |
Caches the metadata of the temporary table when used inside a stored procedure. This can significantly reduce metadata contention in tempdb during high-concurrency scenarios. |
Does not cache metadata, causing the temporary table to be created from scratch every time. This can lead to higher PAGELATCH_EX contention on system objects in tempdb. |
Performance |
Slightly slower than SELECT INTO for one-time ad-hoc operations because the table must be created first. |
Faster for ad-hoc queries since it combines table creation and data insertion in a single operation. |
Flexibility |
More flexible for complex operations, such as handling schema changes or defining constraints and indexes. |
Best suited for simple, one-time data population tasks. |
When to Use INSERT INTO?
- Use INSERT INTO when working with temporary tables inside stored procedures.
- Always create the table with necessary indexes, keys, and constraints beforehand to leverage metadata caching.
Example
USE AdventureWorks2022
GO
CREATE TABLE #SalesData (
SalesOrderID INT PRIMARY KEY,
OrderDate DATETIME,
TotalDue MONEY
)
INSERT INTO #SalesData (SalesOrderID, OrderDate, TotalDue)
SELECT SalesOrderID, OrderDate, TotalDue
FROM Sales.SalesOrderHeader
When to Use SELECT INTO?
- Use SELECT INTO for quick, ad-hoc data population tasks where performance is less critical.
- Avoid using SELECT INTO in stored procedures or high-concurrency environments.
Example
USE AdventureWorks2022
GO
SELECT SalesOrderID, OrderDate, TotalDue
INTO #QuickSalesData
FROM Sales.SalesOrderHeader
Why INSERT INTO Is Better for Temp Tables
- Metadata Caching: When you create a temp table beforehand (as in INSERT INTO), SQL Server can cache its metadata if the temp table is used inside a stored procedure. This allows the table structure to be reused without additional overhead. Caching reduces metadata contention on tempdb, avoiding PAGELATCH_EX waits on system tables.
- Avoiding Schema Modifications: With INSERT INTO, the schema is fixed, and you avoid altering the table structure later, which invalidates metadata caching. Altering a temp table's schema after creation (e.g., adding indexes post SELECT INTO) prevents SQL Server from caching the table metadata.
- Concurrency: In high-concurrency scenarios, the ability to reuse cached metadata can improve performance and reduce contention in tempdb.
Best Practices for Temp Tables
- Wrap temp tables inside stored procedures to take advantage of metadata caching.
- Avoid altering the schema of temp tables after creation.
- Define indexes in line with the table definition (supported since SQL Server 2014).
Conclusion
While both INSERT INTO and SELECT INTO have their use cases, INSERT INTO is generally better for performance-critical scenarios involving temporary tables. The ability to define constraints and indexes upfront, combined with metadata caching, makes it the preferred choice for stored procedures and high-concurrency workloads. However, SELECT INTO remains a handy tool for quick, one-off operations where schema flexibility and speed are prioritized over performance optimization. By understanding their differences and knowing when to use each method, SQL developers and DBAs can make informed decisions to optimize their workflows.