Introduction
Many of our application works as a subsidy of one or more parent application. For that reason, many of our application has to collect data from the different systems in different formats, to provide users with a unified view of them. Getting/Copying data from different places where they live, into a common host or place is known as data integration. According to Wiki - "Data integration involves combining data residing in different sources and providing users with a unified view of them."
What we'll do
In this article will integrate data from a text file to an MSSQL database. We'll read/extract data from a text file, from a specific location and move the file to another location on the same server. It is always better to do this type of task with SSIS. But sometimes SSIS creates extra dependencies for the developers. So they try to do it using SQL Server. You can do this with the following steps. It'll dynamically read your text file from a specific location(i.e: D:\NewData\) and also will move the file(s) to another location (i.e: D:\OldData\) after reading. Let's do it step by step.
STEP 1
Suppose you have the following text files inside a folder that need to extract and store into an SQL table. Just copy & paste the following data to your Data_01012021.txt text file, inside the NewData folder.
AC001 |
,Raju Ahmed 1, |
1945-11-01 00:00:00.000; |
AC002 |
,Raju Ahmed 2, |
1995-11-02 00:00:00.000; |
AC003 |
,Raju Ahmed 3, |
1999-05-03 00:00:00.000; |
AC004 |
,Raju Ahmed 4, |
1989-09-18 00:00:00.000; |
AC005 |
,Raju Ahmed 5, |
1989-12-25 00:00:00.000; |
AC006 |
,Raju Ahmed 6, |
1989-11-14 00:00:00.000; |
AC007 |
,Raju Ahmed 7, |
1952-02-27 00:00:00.000; |
AC008 |
,Raju Ahmed 8, |
1912-04-06 00:00:00.000; |
AC009 |
,Raju Ahmed 9, |
1919-07-08 00:00:00.000; |
AC010 |
,Raju Ahmed 10 |
1989-09-18 00:00:00.000; |
AC011 |
,Raju Ahmed 11 |
1956-08-27 00:00:00.000; |
AC012 |
,Raju Ahmed 12 |
1982-02-21 00:00:00.000; |
From here copy & paste the following data to your Data_02012021.txt text file, inside the NewData folder.
AC013 |
,Raju Ahmed 13, |
1945-11-01 00:00:00.000; |
AC014 |
,Raju Ahmed 14, |
1995-11-02 00:00:00.000; |
AC015 |
,Raju Ahmed 15, |
1999-05-03 00:00:00.000; |
AC016 |
,Raju Ahmed 16, |
1989-09-18 00:00:00.000; |
AC017 |
,Raju Ahmed 17, |
1989-12-25 00:00:00.000; |
AC018 |
,Raju Ahmed 18, |
1989-11-14 00:00:00.000; |
AC019 |
,Raju Ahmed 19, |
1952-02-27 00:00:00.000; |
AC020 |
,Raju Ahmed 20, |
1912-04-06 00:00:00.000; |
AC021 |
,Raju Ahmed 21, |
1919-07-08 00:00:00.000; |
AC022 |
,Raju Ahmed 22, |
1989-09-18 00:00:00.000; |
AC023 |
,Raju Ahmed 23, |
1956-08-27 00:00:00.000; |
AC024 |
,Raju Ahmed 24, |
1982-02-21 00:00:00.000; |
STEP 2
And to store the extracted data from the mentioned files, you have the following database and table to store data from a text file. Execute the following script to create DB & table.
CREATE DATABASE [DataIntegrationDB]
GO
USE [DataIntegrationDB]
GO
CREATE TABLE [dbo].[tbl_NewVendorData](
[AccountNumber] [nvarchar](15) NOT NULL,
[Name] [nvarchar](50) NOT NULL,
[ModificationDate] [datetime] NOT NULL,
PRIMARY KEY CLUSTERED
(
[AccountNumber] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
STEP 3
In this step, we'll create an SP to read a text file and move the files into a new location. The following SP will take two-parameter, one will take the path from where the text file will be read and another parameter will take the path to move the files.
CREATE PROC uspReadTextFileAndMove
(
@oldDir VARCHAR(1000)='D:\NewData',
@newDir VARCHAR(1000)='D:\OldData'
)
AS
BEGIN
DECLARE @command VARCHAR(1000);
DECLARE @insertCommand VARCHAR(1000);
DECLARE @cmdFileMove VARCHAR(1000);
DECLARE @files TABLE ([Id] INT IDENTITY, [FileName] VARCHAR(1000));
DECLARE @FileToRead VARCHAR(1000)
SET @command = 'dir /b ' + @fileDir +'\*.txt';
INSERT INTO @files EXECUTE xp_cmdshell @command;
DECLARE fileReadCursor CURSOR FOR
SELECT * FROM @files WHERE [FileName] IS NOT NULL AND [FileName] <>'File Not Found'
OPEN fileReadCursor;
FETCH fileReadCursor INTO @FileToRead;
WHILE (@@FETCH_STATUS = 0)
BEGIN
SET @insertCommand= 'BULK INSERT [dbo].[tbl_NewVendorData] FROM ''' +@oldDir+'\'+ @FileToRead + ''' WITH ( FIELDTERMINATOR='','', ROWTERMINATOR='';\n'');'
EXECUTE(@insertCommand);
SET @cmdFileMove= 'MOVE ' + @oldDir + '\' + @FileToRead +' '+ @newDir +'\'+@FileToRead
EXEC master..xp_cmdshell @cmdFileMove
--PRINT @insertCommand;
--PRINT @FileToRead;
--PRINT @cmdFileMove;
FETCH fileReadCursor INTO @FileToRead;
END;
CLOSE fileReadCursor;
DEALLOCATE fileReadCursor;
END
STEP 4
Now we'll make a SQL server Job Scheduler to execute STEP 3 on a daily basis or hourly basis or whatever you want.
Change as per your need as guided by the comment line in the following script and run the script to create a Job scheduler, which will run after every 1 minute (you may change it hourly, daily, etc basis) & it will execute our SP.
USE tempdb
GO
IF OBJECT_ID('##tempJobTableConfig', 'U') IS NOT NULL
BEGIN
DROP TABLE ##tempJobTableConfig;
END
DECLARE @DatabaseName VARCHAR(200)='Codefirst' --Change Here: Change Your Database name.
DECLARE @TrialDays INT = 0 --Change Here: Change your trial days number from @TrialDays variable.
DECLARE @freq_subday_type_Variable INT= 4 --if you want it as minute change it value as 4,If value is 8 then it will be hour,
DECLARE @freq_subday_interval_Variable INT=1 --Change here as: If your @freq_subday_type=8 then 12 will be hour, @freq_subday_type=4 then 12 will be minutes,
SELECT @DatabaseName AS DatabaseName
,DATEADD(DAY,@TrialDays,GETDATE()) AS ExpairationDate
,ISNULL(@freq_subday_type_Variable,4) AS freq_subday_type_Variable
,ISNULL(@freq_subday_interval_Variable,15) AS freq_subday_interval_Variable
INTO ##tempJobTableConfig
USE [msdb]
GO
DECLARE @jobId2 BINARY(16)
SELECT @jobId2 = job_id FROM msdb.dbo.sysjobs WHERE (name = N'TextFileReadAndMoveJob')
IF (@jobId2 IS NOT NULL)
BEGIN
EXEC msdb.dbo.sp_delete_job @jobId2
END
BEGIN TRANSACTION
DECLARE @DateInIntData INT
DECLARE @jobId BINARY(16)
DECLARE @ReturnCode INT
DECLARE @User NVARCHAR(500)=(SELECT SUSER_NAME())
/*========================================================
Value of @DateInIntData will be like this =20210224, where
2021=year,02=month & 24=day & it'll dynamically add current
date as your schedular start date.
==========================================================*/
SELECT @ReturnCode = 0
SET @DateInIntData=CAST(
(CAST((SELECT DATEPART(YEAR,GETDATE())) AS VARCHAR(4))
+RIGHT('00'+CAST((SELECT DATEPART(MONTH,GETDATE())) AS VARCHAR(4)),2)
+RIGHT('00'+CAST((SELECT DATEPART(DAY,GETDATE())) AS VARCHAR(4)),2))
AS INT)
IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories WHERE name=N'[Uncategorized (Local)]' AND category_class=1)
BEGIN
EXEC @ReturnCode = msdb.dbo.sp_add_category @class=N'JOB', @type=N'LOCAL', @name=N'[Uncategorized (Local)]'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
END
EXEC @ReturnCode = msdb.dbo.sp_add_job @job_name=N'TextFileReadAndMoveJob',
@enabled=1,
@notify_level_eventlog=0,
@notify_level_email=0,
@notify_level_netsend=0,
@notify_level_page=0,
@delete_level=0,
@description=N'No description available.',
@category_name=N'[Uncategorized (Local)]',
@owner_login_name=@User, @job_id = @jobId OUTPUT
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
DECLARE @databaseValue VARCHAR(400)=(SELECT DatabaseName FROM tempdb..##tempJobTableConfig)
DECLARE @commandValue NVARCHAR(500)=N'USE '+@databaseValue+'
GO
EXEC uspReadTextFileAndMove;
'
EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Execution',
@step_id=1,
@cmdexec_success_code=0,
@on_success_action=1,
@on_success_step_id=0,
@on_fail_action=2,
@on_fail_step_id=0,
@retry_attempts=0,
@retry_interval=0,
@os_run_priority=0, @subsystem=N'TSQL',
@command=@commandValue,
@database_name=N'master',
@flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
DECLARE @schedule_uid2 NVARCHAR(500)=(SELECT NEWID())
DECLARE @freq_subday_type_Value INT=(SELECT ISNULL(freq_subday_type_Variable,4) FROM tempdb..##tempJobTableConfig)
DECLARE @freq_subday_interval_Value INT=(SELECT ISNULL(freq_subday_interval_Variable,4) FROM tempdb..##tempJobTableConfig)
EXEC @ReturnCode = msdb.dbo.sp_add_jobschedule @job_id=@jobId, @name=N'TextFileReadAndMoveJobSchedule',
@enabled=1,
@freq_type=4,
@freq_interval=1,
@freq_subday_type=@freq_subday_type_Value,
@freq_subday_interval=@freq_subday_interval_Value,
@freq_relative_interval=0,
@freq_recurrence_factor=0,
@active_start_date=@DateInIntData,
@active_end_date=99991231,
@active_start_time=0,
@active_end_time=235959,
@schedule_uid=@schedule_uid2
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
COMMIT TRANSACTION
GOTO EndSave
QuitWithRollback:
IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION
EndSave:
GO
Note
You may need to enable 'xp_cmdshell's permission. To do that execute the following code.
-- this turns on advanced options and is needed to configure xp_cmdshell
EXEC sp_configure 'show advanced options', '1'
RECONFIGURE
-- this disables xp_cmdshell
EXEC sp_configure 'xp_cmdshell', '0'
RECONFIGURE