Microsoft Word is an application many of us use every day. But have you ever thought about automating Word documents with C#? In this article, we will write code that extracts the hyperlinks from a Word document.
Before we begin writing the code, add the following three namespaces to your C# project. The last one requires a reference to the Microsoft.Office.Interop.Word assembly.
using System;
using System.Data;
using Word = Microsoft.Office.Interop.Word;
Step 1
Since we are going to store all the links in a data table, we start by creating a data table and adding one column to it.
DataTable dtHyperlinks = new DataTable();
dtHyperlinks.Columns.Add("Hyperlinks");
Step 2
In this step, we will create the Word Application and Document objects.
string FilePath = "Add your word file path here";
Word.Application wApp = new Word.Application(); // creates an instance of the Word application
Word.Document wDoc = wApp.Documents.Open(FileName: FilePath); // opens the file and returns a Word Document object
FilePath is a string variable that holds the path of the file from which we want to extract the hyperlinks.
Step 3
Now we will create a Hyperlinks object and use the document object to get the hyperlinks in the document.
Layman's explanation of this step: we need the hyperlinks from a document, right? So we ask the document object for its Hyperlinks collection.
Word.Hyperlinks wLinks = wDoc.Hyperlinks; // wLinks now holds all the hyperlinks in the document
Step 4
Now that we have all the hyperlinks in the wLinks object, we will iterate through it and add a row to our data table dtHyperlinks for each one. Note that Word collections are 1-based, so the loop starts at 1.
for (int i = 1; i <= wLinks.Count; i++) {
string c = wLinks[i].Address; // the address (target) of the current hyperlink
dtHyperlinks.Rows.Add(c);
}
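The loop above stores only Address. For links that point to a location inside the same document (a bookmark or heading), Address is typically empty and the target sits in the SubAddress property instead. The following is a minimal sketch of one way to handle that, not part of the original code: the class HyperlinkRows and the helpers BuildTarget and Fill are hypothetical names, and joining the two parts with "#" is an assumption made for illustration.

```csharp
using System;
using System.Data;
using Word = Microsoft.Office.Interop.Word;

static class HyperlinkRows
{
    // Hypothetical helper: combines Address and SubAddress into one string.
    // The "#" separator is an illustrative choice, not something Word prescribes.
    public static string BuildTarget(string address, string subAddress)
    {
        bool hasAddress = !string.IsNullOrEmpty(address);
        bool hasSub = !string.IsNullOrEmpty(subAddress);

        if (hasAddress && hasSub) return address + "#" + subAddress;
        if (hasAddress) return address;
        if (hasSub) return "#" + subAddress;
        return string.Empty;
    }

    // Same loop as in Step 4, but rows with no usable target are skipped.
    public static void Fill(Word.Hyperlinks wLinks, DataTable dtHyperlinks)
    {
        for (int i = 1; i <= wLinks.Count; i++) // Word collections are 1-based
        {
            Word.Hyperlink link = wLinks[i];
            string target = BuildTarget(link.Address, link.SubAddress);
            if (target.Length > 0)
            {
                dtHyperlinks.Rows.Add(target);
            }
        }
    }
}
```

With this in place, the loop in Step 4 could be replaced by a call such as HyperlinkRows.Fill(wLinks, dtHyperlinks).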
But wait, we should not forget coding best practices. Let's add a few lines to save and close the document, quit Word, and run the garbage collector so that the COM objects are released. Without the call to Quit, the Word process keeps running in the background.
wApp.Options.WarnBeforeSavingPrintingSendingMarkup = false;
wDoc.Save(); // saving the document
wDoc.Close(); // closing the document
wApp.Quit(); // closing the Word application
GC.Collect();
GC.WaitForPendingFinalizers();
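If an exception is thrown while the links are being read, the cleanup lines above are never reached and Word stays open. The sketch below shows one possible way to guard against that with try/finally. It is an alternative under stated assumptions, not the method used in this article: the class HyperlinkReader and the method Read are hypothetical names, and the document is opened read-only and closed without saving because the code only reads from it.

```csharp
using System;
using System.Data;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

static class HyperlinkReader
{
    // Hypothetical wrapper around Steps 1 to 4 with guaranteed cleanup.
    public static DataTable Read(string filePath)
    {
        DataTable dtHyperlinks = new DataTable();
        dtHyperlinks.Columns.Add("Hyperlinks");

        Word.Application wApp = null;
        Word.Document wDoc = null;
        try
        {
            wApp = new Word.Application();
            // Assumption: read-only access is enough, since we only read links.
            wDoc = wApp.Documents.Open(FileName: filePath, ReadOnly: true);

            Word.Hyperlinks wLinks = wDoc.Hyperlinks;
            for (int i = 1; i <= wLinks.Count; i++)
            {
                dtHyperlinks.Rows.Add(wLinks[i].Address);
            }
        }
        finally
        {
            // Runs whether or not an exception occurred above.
            if (wDoc != null)
            {
                wDoc.Close(SaveChanges: false);
                Marshal.ReleaseComObject(wDoc);
            }
            if (wApp != null)
            {
                wApp.Quit(SaveChanges: false);
                Marshal.ReleaseComObject(wApp);
            }
        }
        return dtHyperlinks;
    }
}
```

A caller would then write something like DataTable links = HyperlinkReader.Read(FilePath); and would not need to touch the Word objects directly.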
Final Code
DataTable dtHyperlinks = new DataTable();
dtHyperlinks.Columns.Add("Hyperlinks");
string FilePath = "Add your word file path here";
Word.Application wApp = new Word.Application();
Word.Document wDoc = wApp.Documents.Open(FileName: FilePath);
Word.Hyperlinks wLinks = wDoc.Hyperlinks;
for (int i = 1; i <= wLinks.Count; i++) {
string c = wLinks[i].Address;
dtHyperlinks.Rows.Add(c);
}
wApp.Options.WarnBeforeSavingPrintingSendingMarkup = false;
wDoc.Save();
wDoc.Close();
wApp.Quit();
GC.Collect();
GC.WaitForPendingFinalizers();
That's it. With just a few lines of code, we can now extract all the hyperlinks from a Word document and store them in a DataTable.
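Once the links are in the DataTable, a natural next step is to read them back out. The following is a small sketch of that, assuming the single "Hyperlinks" column created in Step 1. The class HyperlinkTable and the method DistinctLinks are hypothetical names introduced here for illustration; dropping empty and duplicate entries is an assumption about what a caller might want.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

static class HyperlinkTable
{
    // Hypothetical helper: returns the distinct, non-empty values of the
    // "Hyperlinks" column in their original order.
    public static List<string> DistinctLinks(DataTable dtHyperlinks)
    {
        var seen = new HashSet<string>();
        var result = new List<string>();

        foreach (DataRow row in dtHyperlinks.Rows)
        {
            // A row added with a null address holds DBNull, so "as string" yields null.
            string link = row["Hyperlinks"] as string;
            if (!string.IsNullOrEmpty(link) && seen.Add(link))
            {
                result.Add(link);
            }
        }
        return result;
    }
}
```

For example, foreach (string link in HyperlinkTable.DistinctLinks(dtHyperlinks)) Console.WriteLine(link); would print each unique link once.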