Azure Batch task monitoring and cost saving

Azure Batch has come out of preview, and its technical overview gives details of how it works. A more interesting and detailed article also gives a brief description of pool and VM lifetime. The cost charged is based on the number of VMs running in a pool and the number of minutes they run. The article describes two ways to maintain a pool: predefined and dynamic, with the dynamic approach aimed at cost efficiency. There are also a couple of examples posted on the MSDN pages that deal with pool creation and then tasks, and other examples that cover auto pool creation through a work item.
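To make the billing model concrete, here is a minimal sketch of the arithmetic, assuming cost scales linearly with VM-minutes as described above. The helper name `EstimatePoolCost` and the hourly rate are hypothetical and used only for illustration; they are not actual Azure prices. The sketch uses C# top-level statements, so it assumes a recent .NET SDK.

```csharp
using System;

// Hypothetical helper: estimates pool cost from VM-minutes.
// pricePerVmHour is an assumed illustrative rate, not an actual Azure price.
decimal EstimatePoolCost(int vmCount, int minutesRunning, decimal pricePerVmHour)
{
    decimal vmMinutes = (decimal)vmCount * minutesRunning;
    return vmMinutes * pricePerVmHour / 60m;
}

// Compare keeping 10 VMs for a full hour with releasing them after 20 minutes.
decimal keptAll = EstimatePoolCost(10, 60, 0.12m);
decimal releasedEarly = EstimatePoolCost(10, 20, 0.12m);
Console.WriteLine($"Kept for 60 min: {keptAll}, released after 20 min: {releasedEarly}");
```

Under this assumption, a VM that is released as soon as its task finishes stops adding to the VM-minute total, which is the motivation for the rest of this post.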

What these samples have in common is that they use the toolbox to wait a predefined period of time for the tasks to complete. Below is the code snippet, in which a task state monitor is created and waits up to 10 minutes.

    client.OpenToolbox().CreateTaskStateMonitor().WaitAll(job.ListTasks(), TaskState.Completed, TimeSpan.FromMinutes(10));

Then the stdout of each task is read and displayed in the console.

    foreach (ICloudTask task1 in job.ListTasks())
    {
        // Print the task name as a label, followed by the task's stdout.
        Console.WriteLine("Task " + task1.Name + " :\n" + task1.GetTaskFile(Constants.StandardOutFileName).ReadAsString());
    }

We are going to extend this code to read the task states every 10 minutes, avoid reading stdout while a task is not complete, print output only once per task, and remove the VM of each task that is complete.

First, we need to add a check to the code above to avoid exceptions caused by trying to read a task's output before it is done. Add a condition inside the foreach loop so that output is read only from completed tasks.

Next, we add a while loop that keeps reading the task status until all tasks are done. In the same loop, we remove the VM of each completed task to save cost.

Finally, to keep the console clean, we do not want to print every task's output on each pass, so we use a dictionary to keep track of the tasks that completed previously and do not read them again.

    // Get the job to monitor its status.
    ICloudJob job = wm.GetJob(WorkItemName, JobName);
    Console.WriteLine("Waiting 10 minutes for all tasks to reach the completed state");

    // Get the pool name of the work item.
    string workItemPoolName = job.PoolName;
    IPoolManager pm = client.OpenPoolManager();
    ICloudPool pool = pm.GetPool(workItemPoolName);

    // Tasks that have already completed and been printed, keyed by task name.
    Dictionary<string, TaskState> taskState = new Dictionary<string, TaskState>();

    // Keep polling until every task that was added has been reported.
    while (taskState.Count < tasksToAdd.Count)
    {
        client.OpenToolbox().CreateTaskStateMonitor().WaitAll(job.ListTasks(), TaskState.Completed, TimeSpan.FromMinutes(10));

        foreach (ICloudTask task1 in wm.ListTasks(WorkItemName, JobName))
        {
            // Skip tasks that were already reported as completed in a previous pass.
            if (!taskState.ContainsKey(task1.Name))
            {
                if (task1.State == TaskState.Completed)
                {
                    Console.WriteLine("StdOut for Task " + task1.Name + " :\n" + task1.GetTaskFile(Constants.StandardOutFileName).ReadAsString());
                    Console.WriteLine("StdErr for Task " + task1.Name + " :\n" + task1.GetTaskFile(Constants.StandardErrorFileName).ReadAsString());

                    // Remember that this task has been reported.
                    taskState.Add(task1.Name, task1.State);

                    // Get the VM of the task and remove it. To be safe, set the deallocation option to TaskCompletion.
                    VMInformation vmoftask = task1.VMInformation;
                    Console.WriteLine("Deleting vm: " + vmoftask.VMName + "\n");
                    IVM vmToDelete = pool.GetVM(vmoftask.VMName);
                    vmToDelete.RemoveFromPool(TVMDeallocationOption.TaskCompletion);
                    Console.WriteLine("Deleted vm: " + vmoftask.VMName + "\n");
                }
                else
                {
                    // Just print the task state if it is not complete.
                    Console.WriteLine("Task " + task1.Name + " is at " + task1.State + " state \n");
                }
            }
        }

        Console.WriteLine("Waiting another 10 minutes to get all task states.\n");

        // Get the latest job object again.
        job.Refresh();
    }
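
The dictionary in the loop above only ever holds tasks that have completed, so the stored state is always the same and a set of task names would serve equally well. The following is a minimal sketch of that "report each task once" bookkeeping on its own. It uses in-memory stand-in data (the `polls` list and its tuples are hypothetical) instead of Batch SDK objects, and it is an alternative illustration rather than the code used above.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for successive polls of the job: each poll lists (task name, completed?) pairs.
// These are hypothetical in-memory values, not Batch SDK objects.
var polls = new List<(string Name, bool Completed)[]>
{
    new (string Name, bool Completed)[] { ("task1", true), ("task2", false) },
    new (string Name, bool Completed)[] { ("task1", true), ("task2", true) },
};

var reported = new HashSet<string>();   // names of tasks already reported
var log = new List<string>();           // order in which tasks were reported

foreach (var poll in polls)
{
    foreach (var task in poll)
    {
        // HashSet.Add returns false when the name is already present,
        // so each completed task is reported exactly once.
        if (task.Completed && reported.Add(task.Name))
        {
            log.Add(task.Name);
            Console.WriteLine("Reporting output for " + task.Name);
        }
    }
}
```

In the real loop, the body of the `if` is where the stdout and stderr reads and the VM removal would go.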