Load And Convert A Document From Amazon S3 Or Azure Blob Storage

Data processing is the backbone of any organization. It gives insight into a company's progress and a way to keep track of it. Data processing is usually described as having about five stages, and data conversion is one of the important ones.
 

Data/Document Conversion

 
A document is essentially a container for data. It could be a PDF, a Word file, a spreadsheet, an image or a drawing (e.g. an AutoCAD file). Suppose your business needs to take an AutoCAD file as input from cloud storage (e.g. Amazon S3 or Azure Blob Storage) and convert it to PDF or a presentation format, so that you can present and explain it to the relevant audience without anyone needing to install AutoCAD.
 
At this point, you really need a document converter. Wouldn't it be better to write your own document conversion application?
 

Introduction to the Document Conversion API 

 
In this blog post, we'll see how to pull a document from cloud storage, pass it to a document conversion API and generate the converted output file. The API we'll use is GroupDocs.Conversion for .NET.
 
This back-end, UI-agnostic API can be integrated into any .NET application, irrespective of platform or framework dependencies. Have a look at the minimum requirements to get started and at the installation guide.
 

Processing or Implementation 

 
Pull Source file from S3 Storage and Convert it to PDF 
 
The minimal code to meet this requirement is given below.
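    // Requires the AWSSDK.S3 and GroupDocs.Conversion NuGet packages.
    using System.IO;
    using Amazon.S3;
    using Amazon.S3.Model;
    using GroupDocs.Conversion;
    using GroupDocs.Conversion.Options.Convert;

    public static void Run()
    {
        string key = "sample.docx";
        // Verbatim string, so the backslash is not parsed as an escape sequence.
        string outputFile = Path.Combine(@"c:\output", "converted.pdf");
        // The converter receives a delegate that supplies the source document stream.
        using (Converter converter = new Converter(() => DownloadFile(key)))
        {
            PdfConvertOptions options = new PdfConvertOptions();
            converter.Convert(outputFile, options);
        }
    }

    public static Stream DownloadFile(string key)
    {
        // The parameterless constructor uses the default AWS credentials and region.
        AmazonS3Client client = new AmazonS3Client();
        string bucketName = "my-bucket";
        GetObjectRequest request = new GetObjectRequest
        {
            Key = key,
            BucketName = bucketName
        };
        // Synchronous GetObject exists in the .NET Framework build of the SDK;
        // .NET Core builds expose GetObjectAsync instead.
        using (GetObjectResponse response = client.GetObject(request))
        {
            // Copy to memory: the response stream is closed when the response is disposed.
            MemoryStream stream = new MemoryStream();
            response.ResponseStream.CopyTo(stream);
            stream.Position = 0; // rewind so the converter reads from the start
            return stream;
        }
    }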
Here we fetch a Word file from S3 storage and pass it to the Converter class, which then converts it to PDF.
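The download helper above copies the S3 response into a MemoryStream and rewinds it before returning. The copy matters because the response stream is closed once the response is disposed, and the rewind matters because CopyTo leaves the position at the end of the buffer. Below is a minimal sketch of just that buffering step, using only System.IO. The names StreamBuffering and BufferToMemory are hypothetical and are not part of the AWS SDK or GroupDocs.Conversion.

```csharp
using System;
using System.IO;

public static class StreamBuffering
{
    // Hypothetical helper: copies any readable stream into a seekable
    // in-memory stream that stays valid after the source is disposed.
    public static MemoryStream BufferToMemory(Stream source)
    {
        if (source == null)
        {
            throw new ArgumentNullException(nameof(source));
        }

        MemoryStream buffer = new MemoryStream();
        source.CopyTo(buffer);   // leaves buffer.Position at the end of the data
        buffer.Position = 0;     // rewind so the next reader starts at byte 0
        return buffer;
    }
}
```

With such a helper, the body of the using block in DownloadFile could be reduced to a single return statement that passes response.ResponseStream to BufferToMemory. Note that this keeps the whole file in memory, which is reasonable for typical office documents but may not suit very large files.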
 
Fetch Source file from Azure Blob Storage  
    // Requires the GroupDocs.Conversion NuGet package. The storage namespaces below
    // assume the WindowsAzure.Storage package; adjust them to the SDK you use.
    using System;
    using System.IO;
    using GroupDocs.Conversion;
    using GroupDocs.Conversion.Options.Convert;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Auth;
    using Microsoft.WindowsAzure.Storage.Blob;

    public static void Run()
    {
        string blobName = "sample.docx";
        // Verbatim string, so the backslash is not parsed as an escape sequence.
        string outputFile = Path.Combine(@"c:\output", "converted.pdf");
        // The converter receives a delegate that supplies the source document stream.
        using (Converter converter = new Converter(() => DownloadFile(blobName)))
        {
            PdfConvertOptions options = new PdfConvertOptions();
            converter.Convert(outputFile, options);
        }
    }

    public static Stream DownloadFile(string blobName)
    {
        CloudBlobContainer container = GetContainer();
        CloudBlob blob = container.GetBlobReference(blobName);
        MemoryStream memoryStream = new MemoryStream();
        blob.DownloadToStream(memoryStream);
        memoryStream.Position = 0; // rewind so the converter reads from the start
        return memoryStream;
    }

    private static CloudBlobContainer GetContainer()
    {
        // Replace the masked values with your own account details.
        string accountName = "***";
        string accountKey = "***";
        string endpoint = $"https://{accountName}.blob.core.windows.net/";
        string containerName = "***";
        StorageCredentials storageCredentials = new StorageCredentials(accountName, accountKey);
        // Only the blob endpoint is needed; the remaining endpoints are left null.
        CloudStorageAccount cloudStorageAccount = new CloudStorageAccount(
            storageCredentials, new Uri(endpoint), null, null, null);
        CloudBlobClient cloudBlobClient = cloudStorageAccount.CreateCloudBlobClient();
        CloudBlobContainer container = cloudBlobClient.GetContainerReference(containerName);
        container.CreateIfNotExists();
        return container;
    }
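The account name, account key and container name are masked with *** in the listing above. One possible way to supply them without hard-coding is to read them from environment variables. The sketch below assumes that approach. The StorageSettings class and the variable names mentioned afterwards are hypothetical and are not part of the Azure SDK or GroupDocs.Conversion.

```csharp
using System;

public static class StorageSettings
{
    // Hypothetical helper: reads a required setting from the environment
    // and fails fast with a clear message when it is missing or blank.
    public static string GetRequired(string variableName)
    {
        string value = Environment.GetEnvironmentVariable(variableName);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Environment variable '{variableName}' is not set.");
        }
        return value;
    }

    // Builds the blob endpoint in the same format the listing above uses.
    public static Uri BuildBlobEndpoint(string accountName)
    {
        return new Uri($"https://{accountName}.blob.core.windows.net/");
    }
}
```

In GetContainer, accountName could then be assigned from StorageSettings.GetRequired("AZURE_STORAGE_ACCOUNT"), and the key and container name likewise, where the variable name is only an illustrative choice.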
You can now go ahead and implement your own document conversion application. If you run into any issues, post them here.