Problem Statement
The requirement is to extract the data from Twitter and load the extracted data into SQL Database. For this demonstration, I am going to read the Indian Film actor Kamal Haasan's tweets and load that into SQL Server 2016 Database.
Solution
The solution is to use Curl program to read from Twitter using Rest API then Load the output data into SQL Database
Let’s see the solution step by step
Step 1
First, we have to install and configure the curl.
“Curl ('Client for URLs') is a tool to transfer data from or to a server, using one of the supported protocols (HTTP, HTTPS, FTP, FTPS, SCP, SFTP, TFTP, DICT, TELNET, LDAP or FILE). The command is designed to work without user interaction“
Go to -> https://curl.haxx.se/download.html, download the latest version of Curl software under Win 32 - Generic or Win 64 – Generic based on your Windows OS bit.
Step 2
We don’t need to install the curl because the package is already built. Extract the downloaded zip file and copy curl.exe and ca-bundle.crt, paste that in “C:\Curl” (Create folder wherever you want).
Step 3
We have to setup the Environment variable path for Curl.
Go to My Computer -> Properties -> Advanced system settings -> Advanced -> Environment Variables -> Edit Path under System Variables section.
Click New -> Add the curl.exe folder location.
Run the below command in Windows command prompt to check whether Curl is installed and configured correctly.
Step 4
In order to read data from Twitter application, we have to create App in Twitter, Go to -> https://apps.twitter.com
Prerequisites
- The user should have a Twitter account.
- The user should have provided a mobile number in the Twitter account.
Create App by filling the required fields. If you do not have website details, give https://google.com/ a shot.
Click “Create your Twitter Application” button.
Step 5
Go to "Keys and Access Tokens" tab and note down Consumer Key (API Key) and Consumer Secret (API Secret).
Twitter supports OATH 2.0 Authentication so we have to generate Bearer token. For that, we need to convert API Key and API Secret in Base64 format. Go to https://www.base64encode.org/
Syntax
<Consumer Key (API Key)> : <Consumer Secret (API Secret)>
For example
ABCD:EFGH
Click Encode, then copy the encoded code - QUJDRDpFRkdI
Step 6
Enter the below command in Windows Command Prompt.
Syntax
Curl --compressed "https://api.twitter.com/oauth2/token"
--header "Accept-Encoding: identity" --header "Authorization: Basic <Base64Code>"
--header "Content-Type: application/x-www-form-urlencoded; charset=UTF-8"
--data "grant type=client credentials"
Example
Curl --compressed "https://api.twitter.com/oauth2/token" --header "Accept-Encoding: identity" --header "Authorization: Basic QUJDRDpFRkdI" --header "Content-Type: application/x-www-form-urlencoded;charset=UTF-8" --data "grant type=client credentials"
https://dev.twitter.com/oauth/application-only
The response will be in JSON format like below.
{"token_type":"bearer","access_token":"XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"}
Note down the access token.
Step 6
Now, I am going to fetch tweets of the film actor Kamal Haasan. So I want to know his twitter account name
Account name - iKamalHaasan
Curl Command
Curl --get --compressed "https://api.twitter.com/1.1/statuses/user_timeline.json" --data "screen_name=ikamalhaasan" --data "count=200" --header "Authorization: Bearer XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" -o Kamal.txt
Note
Enter your bearer token.
Refer
https://dev.twitter.com/rest/reference/get/statuses/user_timeline
Output file is saved in the C:\Users\Madhan\Kamal_Tweets.txt
Step 7
Run the below SQL statement to load JSON Data into SQL Table.
- CREATE TABLE[dbo].[Tweets_JSON]
- (
- [JSON_STRING][varchar](max) NULL)
- BULK INSERT[dbo].[Tweets_JSON] FROM 'C:\Users\Madhan\Kamal_Tweets.txt'
Step 8
Run the below query to parse JSON String.
- SELECT J.*FROM[dbo].[Tweets_JSON]
- CROSS APPLY OPENJSON(JSON_STRING)
- With(
- [id_str] Varchar(100), [created_at] Varchar(100), [text] NVarchar(600), [source] Varchar(100), [retweet_count] Varchar(100), [favorite_count] Varchar(100), [lang] Varchar(100)
- )
Thank You!!