JavaScript Postback Download Via PowerShell

Introduction

 
Recently, one of my clients needed to automatically download a file from a public-facing state government website. Normally, this is easy to do in a number of ways. PowerShell is the first that comes to mind, but you could also use command-line tools such as wget or curl, to name a couple. However, thanks to the awesome power (note: sarcasm) of DotNetNuke, the download link is hidden behind JavaScript postback functionality.
 
 
Essentially, a postback is where a web page contains a form whose data is consumed by that same page. The data can come from text fields or even a button or link click. When the form is submitted, its data is sent back to the page the form originated from. This is the “postback”: the form “posts” data back to its own page, which returns the appropriate result. In this case, the result is a downloadable file.
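
To make the idea concrete, here is a minimal sketch of what a postback body looks like once it is encoded for the POST request. The helper name ConvertTo-FormBody and the field values are hypothetical and exist only for illustration; Invoke-WebRequest performs this encoding itself when a dictionary is passed to -Body, so nothing like this is needed in the actual script.

```powershell
# Hypothetical helper: turn name/value pairs into a form-encoded POST body.
function ConvertTo-FormBody {
    param([System.Collections.IDictionary]$Fields)

    $pairs = foreach ($key in $Fields.Keys) {
        # Percent-encode both the field name and its value.
        '{0}={1}' -f [uri]::EscapeDataString([string]$key), [uri]::EscapeDataString([string]$Fields[$key])
    }
    $pairs -join '&'
}

# Ordered so the output order is predictable; the values are made up.
$fields = [ordered]@{
    '__EVENTTARGET'   = 'ctl00$DownloadLink'
    '__EVENTARGUMENT' = ''
}
ConvertTo-FormBody -Fields $fields
# → __EVENTTARGET=ctl00%24DownloadLink&__EVENTARGUMENT=
```

The server reads the target name out of that body to decide which control was "clicked", which is why a plain GET of the URL never returns the file.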
 
Disclaimer: I’m not a PowerShell expert by any means. Also, the script requires PowerShell 3.0 or higher because of some of the cmdlets used.
 

Script

 
In a nutshell, the script has to perform the following steps:
  1.  Obtain the URL from a variable.
  2.  Get some session information so that we can “fake” out the server.
  3.  Request the page again with that session and grab its form.
  4.  Set the postback fields on the form to the appropriate values.
  5.  Specify a file name.
  6.  Send the form back to the server.
So, let’s break this apart and walk through it.
 
Of course, this could be turned into a script that accepts the URL as a parameter. In this case, however, I don’t need to do this. The variable is statically set.
    # URL that needs to be fetched
    $url = "https://site.state.gov/default.aspx"
    # Get the server name in case the script moves to another server
    $serverName = $env:computername
While I am here, I also retrieve the server name. This is done so that if the script is going to be moved to another server, the process itself shouldn’t break. I tried to use a little forward thinking here.
 
Next, I’ll use Invoke-WebRequest with the $url variable along with the -SessionVariable parameter. This parameter creates a web request session object and assigns it to the named variable, here “session” (the name is given without the $ prefix). Also, note that I am putting things into a TRY/CATCH block because I want to make sure something happens if things go south during this process.
    TRY {
        # Use Invoke-WebRequest to fetch a session from the site
        Invoke-WebRequest $url -SessionVariable session -UseBasicParsing
Now, we’ll call the Invoke-WebRequest cmdlet again against the same URL, this time passing the session in with -WebSession. This allows us to obtain the form from the page. Note that this call omits -UseBasicParsing, since the Forms collection is only populated when the full parser is used.
        # Get the website $url using the session contained in $session
        $addUserSite = Invoke-WebRequest $url -WebSession $session
        # Invoke-WebRequest does a lot of auto processing, including parsing the forms
        $addUserForm = $addUserSite.Forms[0]
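
As far as I’m aware, the Forms collection depends on the Internet Explorer-based parsing in Windows PowerShell and is not available in PowerShell 6 and later. A rough alternative, sketched here under that assumption, is to build the field table from the response’s InputFields collection instead. ConvertTo-FieldTable is a hypothetical helper, and the sample objects below only stand in for real InputFields entries.

```powershell
# Hypothetical helper: collect name/value pairs from input-field objects.
function ConvertTo-FieldTable {
    param([object[]]$InputFields)

    $table = @{}
    foreach ($field in $InputFields) {
        # Skip inputs without a name; they are not submitted with the form.
        if ($field.name) {
            $table[$field.name] = [string]$field.value
        }
    }
    return $table
}

# Stand-in objects shaped like entries of $response.InputFields.
$sample = @(
    [pscustomobject]@{ name = '__VIEWSTATE';   value = 'abc123' }
    [pscustomobject]@{ name = '__EVENTTARGET'; value = '' }
    [pscustomobject]@{ name = '';              value = 'ignored' }
)
$fields = ConvertTo-FieldTable -InputFields $sample
$fields['__EVENTTARGET'] = 'ctl00$DownloadLink'   # made-up target
```

In a real run the helper would be fed the InputFields property of the response, and the resulting table would be passed to -Body. This only covers input elements, so a form that relies on select or textarea values would need more work.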
When dealing with a postback process, there are usually certain fields on the form that are critical to the process. In this case, there were two fields that I needed:
  • __EVENTTARGET
  • __EVENTARGUMENT
Depending on the form and the requirements, the JavaScript could be expecting more or fewer fields, so your mileage will vary. I used Fiddler to help identify which fields were needed. I’ll blog about Fiddler in an upcoming post. You can also determine what is needed from the page source; you just have to go looking for it.
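
If you go the page-source route, something like the following sketch can list the postback calls on a page. The HTML fragment, link IDs, and control names are invented for illustration; on a real page the text would come from the Content property of the web response. Some pages HTML-encode the quotes inside the link, so the sketch decodes the text first.

```powershell
# Hypothetical page fragment standing in for the real page source.
$html = @'
<a id="lnkExcel" href="javascript:__doPostBack(&#39;ctl00$File$ExcelFile&#39;,&#39;&#39;)">Download</a>
<a id="lnkNext" href="javascript:__doPostBack('ctl00$Grid','Page$2')">Next</a>
'@

# Decode entities such as &#39; so both link styles look the same.
$decoded = [System.Net.WebUtility]::HtmlDecode($html)

# Capture the two quoted arguments of every __doPostBack call.
$pattern = "__doPostBack\('([^']*)','([^']*)'\)"
$postbacks = foreach ($m in [regex]::Matches($decoded, $pattern)) {
    [pscustomobject]@{
        EventTarget   = $m.Groups[1].Value
        EventArgument = $m.Groups[2].Value
    }
}
$postbacks | Format-Table -AutoSize
```

The first argument of each call is the value for __EVENTTARGET and the second is the value for __EVENTARGUMENT.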
 
Now that I know which fields I need, I can continue with the script.
        # Add form fields for the event target & argument that are needed to actually do the postback
        $addUserForm.Fields["__EVENTTARGET"] = "dnn`$abcd1234`$File`$ExcelFile"
        $addUserForm.Fields["__EVENTARGUMENT"] = ""
I need to be able to download the file, so I need a file name.
        # Where are we saving the file & what is the file name
        $filename = "C:\temp\Download_FIle_Name.txt"
Now that everything is set, we do another Invoke-WebRequest and send everything back to the page. I’ve also included the “-OutFile” parameter here and passed in the $filename value that I set in the step above. The cmdlet will then download the file to the specified directory and file name. I also end the TRY block here.
        # Invoke another web request using the same URL and session information, and output to the $filename variable
        Invoke-WebRequest -Uri $url -Method Post -Body $addUserForm.Fields -WebSession $session -UseBasicParsing -OutFile $filename
    }
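
One thing I’d suggest checking, although it is not part of the original script: when the posted fields are wrong, a postback page will often just return itself, and the HTML of the page gets saved under the download file name without any error. A minimal sketch of a sanity check follows. Test-LooksLikeHtml is a hypothetical helper, and it assumes the expected download is not itself an HTML file.

```powershell
# Hypothetical helper: returns $true when the file starts like an HTML document.
function Test-LooksLikeHtml {
    param([string]$Path)

    # Read the file as text and inspect only the beginning.
    $text = [System.IO.File]::ReadAllText($Path)
    $head = $text.Substring(0, [Math]::Min(512, $text.Length)).TrimStart()
    return ($head -match '^(<!DOCTYPE|<html)')
}

# Demonstration with a throwaway file instead of a real download.
$demo = [System.IO.Path]::GetTempFileName()
Set-Content -Path $demo -Value '<!DOCTYPE html><html><body>Error</body></html>'
Test-LooksLikeHtml -Path $demo
# → True
Remove-Item $demo
```

Calling a check like this on $filename just before the closing brace of the TRY block, and throwing when it returns $true, would route the failure into the CATCH handler. It reads the whole file into memory, so it is only reasonable for small downloads.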
Finally, I start the CATCH block for the TRY/CATCH. For this client, the best notification method was an email. The email is sent to their help desk provider, which generates an automatic help desk ticket for the IT staff to investigate.
    # Catch any errors from above and send an email to the right people
    CATCH {
        Send-MailMessage -To "[email protected]" -From "[email protected]" -Subject "Some important thing just happened" -SmtpServer "server.smtp.com" -Body "Check stuff out on $serverName"
    }
The completed script can now be scheduled based on the client’s needs. I put the script on a server and created a scheduled task within Windows. You could also put this into a SQL Agent job if needed, as long as the PowerShell version on the SQL Server is sufficient.
 

Summary

 
Handling the postback proved to be more complicated than I had originally thought, and it took some trial and error. I had tried other methods as well, such as wget and curl, but I eventually found that PowerShell was the better solution. Now, the client has an automated way to download the file without manual intervention.
 
You can get the full script from here.


Denny Cherry & Associates Consulting