jagadeesh rp

How to use task parallelism for a large number of files

Dec 2 2014 8:22 PM
Hi Friends,
I have a function that reads a file from the file system, updates it, and writes it back. I have to do this for a large number of files. The file names are stored in a DataTable, and I have created a DataView on this table.
The current sequential function is shown below, and it is time-consuming. I need to improve its performance. Can I use task parallelism for file handling? (If you start multiple copy-and-paste operations on a folder or drive, they run in parallel and finish a bit faster, right?) So I would like to replace the for loop with Parallel.For or Parallel.ForEach.
private void _RegenerateSerialNumber(List<int> i32IndexList)
{
    //SelectedMeasurementData.MeasurementData is a member of the class, of type DataTable
    for (int i32RowCount = 0; i32RowCount < SelectedMeasurementData.MeasurementData.Rows.Count; i32RowCount++)
    {
        SelectedMeasurementData.MeasurementData.Rows[i32RowCount][_strSerialNoCol] = i32RowCount + 1;
    }
    //_dtOverlayData is a member of the class, of type DataTable
    DataView dvResetOverlayShapeRows = new DataView(_dtOverlayData);
    dvResetOverlayShapeRows.RowFilter = _strMeasurementType + " = '" + SelectedMeasurementType.measurementType.ToString() + "'";
    //I want to replace this for loop with Parallel.For or Parallel.ForEach
    for (int i32RowCount = 0; i32RowCount < dvResetOverlayShapeRows.Count; i32RowCount++)
    {
        int i32NewID = i32RowCount + 1;
        dvResetOverlayShapeRows[i32RowCount][_strOverlayShapeID] = i32NewID;
        ///*****************************************
        ///For Image Overlay
        ///*****************************************
        //If the data is present in memory, update it in memory
        if (dvResetOverlayShapeRows[i32RowCount][_strImageOverlayData] != null && dvResetOverlayShapeRows[i32RowCount][_strImageOverlayData].GetType() != typeof(System.DBNull))
        {
            ((IImageShapeBehaviour)dvResetOverlayShapeRows[i32RowCount][_strImageOverlayData]).UpdateOverlayShapeID(i32NewID);
        }
        else //otherwise update it in the cache file
        {
            BinaryFormatter clsBinaryFormatter = new BinaryFormatter();
            //Declared outside the using blocks so that the second block can serialize it
            IImageShapeBehaviour _clsImageShapeBehaviour = null;
            //Get the cache file path for the image overlay
            string strImageOverlayFileName = Convert.ToString(dvResetOverlayShapeRows[i32RowCount][_strImageDataFileName], System.Globalization.CultureInfo.InvariantCulture);
            string strImageOverlayFilePath = _strImageOverlayCacheFolderPath + @"\" + strImageOverlayFileName;
            if (File.Exists(strImageOverlayFilePath))
            {
                using (FileStream _clsImageOverlayFs = new FileStream(strImageOverlayFilePath, FileMode.Open))
                {
                    //Get the overlay object from the cache file
                    _clsImageShapeBehaviour = (IImageShapeBehaviour)clsBinaryFormatter.Deserialize(_clsImageOverlayFs);
                    //Update this object with the new overlay ID
                    _clsImageShapeBehaviour.UpdateOverlayShapeID(i32NewID);
                    _clsImageOverlayFs.Close();
                }
                //Serialize the updated object back to the file
                using (FileStream _clsImageOverlayFs = new FileStream(strImageOverlayFilePath, FileMode.Create))
                {
                    clsBinaryFormatter.Serialize(_clsImageOverlayFs, _clsImageShapeBehaviour);
                    _clsImageOverlayFs.Close();
                }
            }
        }
    }
}
Can anyone tell me how to use Parallel.For, Parallel.ForEach, or Parallel.Invoke instead of the sequential for loop for file handling? I have tried all three constructs, as follows:
try
{
    DataView dvResetOverlayShapeRows = new DataView(_dtOverlayData);
    dvResetOverlayShapeRows.RowFilter = _strMeasurementType + " = '" + SelectedMeasurementType.measurementType.ToString() + "'";
    int count1 = dvResetOverlayShapeRows.Count;
    //for (int i32RowCount = 0; i32RowCount < _dtOverlayData.Rows.Count; i32RowCount++)
    Parallel.For(0, count1,
        (i32RowCount, loopState) =>
        {
            try
            {
                lock (lockObject) //dvResetOverlayShapeRows)
                {
                    int i32NewID = i32RowCount + 1;
                    dvResetOverlayShapeRows[i32RowCount][_strOverlayShapeID] = i32NewID;
                    //dvResetOverlayShapeRows[i32RowCount][_strOverlayShapeID] = i32NewID;
                    ///*****************************************
                    ///For Image Overlay
                    ///*****************************************
                    //If the data is present in memory, update it in memory; otherwise update it in the cache file.
                    if (dvResetOverlayShapeRows[i32RowCount][_strImageOverlayData] != null && dvResetOverlayShapeRows[i32RowCount][_strImageOverlayData].GetType() != typeof(System.DBNull))
                    //if (conbgdvResetOverlayShapeRows.ElementAt(0)[i32RowCount][_strImageOverlayData] != null && conbgdvResetOverlayShapeRows.ElementAt(0)[i32RowCount][_strImageOverlayData].GetType() != typeof(System.DBNull))
                    {
                        ((IImageShapeBehaviour)(dvResetOverlayShapeRows[i32RowCount][_strImageOverlayData])).UpdateOverlayShapeID(i32NewID);
                    }
                    else
                    {
                        BinaryFormatter clsBinaryFormatter = new BinaryFormatter();
                        IImageShapeBehaviour _clsImageShapeBehaviour = null;
                        //Get the cache file path for the image overlay
                        string strImageOverlayFileName = Convert.ToString(dvResetOverlayShapeRows[i32RowCount][_strImageDataFileName], System.Globalization.CultureInfo.InvariantCulture);
                        string strImageOverlayFilePath = _strImageOverlayCacheFolderPath + @"\" + strImageOverlayFileName;
                        if (File.Exists(strImageOverlayFilePath))
                        {
                            using (FileStream _clsImageOverlayFs = new FileStream(strImageOverlayFilePath, FileMode.Open))
                            {
                                //Get the overlay object from the cache file
                                _clsImageShapeBehaviour = (IImageShapeBehaviour)clsBinaryFormatter.Deserialize(_clsImageOverlayFs);
                                //Update this object with the new overlay ID
                                _clsImageShapeBehaviour.UpdateOverlayShapeID(i32NewID);
                                _clsImageOverlayFs.Close();
                            }
                            //Serialize the updated object back to the file
                            using (FileStream _clsImageOverlayFs = new FileStream(strImageOverlayFilePath, FileMode.Create))
                            {
                                clsBinaryFormatter.Serialize(_clsImageOverlayFs, _clsImageShapeBehaviour);
                                _clsImageOverlayFs.Close();
                            }
                        }
                    }
                    ///*****************************************
                    ///For Profile Or Histogram Overlay
                    ///*****************************************
                }
            }
            catch (Exception ex)
            {
                exceptions.Enqueue(ex);
            }
        });
    if (exceptions.Count != 0)
        throw new AggregateException(exceptions);
}
catch (AggregateException ex)
{
    foreach (Exception inner in ex.InnerExceptions)
    {
        //Console.WriteLine(inner.Message);
    }
}
} //end of the enclosing method (header not shown)
I tried two or three variations (lock, SpinLock, Mutex, Parallel.For with different delegates, etc.), but it still sometimes hangs at the lock. If the lock is commented out, a race condition occurs.
Below is the version with different delegates. The problem with it is that it updates localTable but not the corresponding DataView (dvResetOverlayShapeRows) or the associated DataTable (_dtOverlayData).
try
{
    DataView dvResetOverlayShapeRows = new DataView(_dtOverlayData);
    dvResetOverlayShapeRows.RowFilter = _strMeasurementType + " = '" + SelectedMeasurementType.measurementType.ToString() + "'";
    int count1 = dvResetOverlayShapeRows.Count;
    //for (int i32RowCount = 0; i32RowCount < _dtOverlayData.Rows.Count; i32RowCount++)
    Parallel.For(0, count1,
        () =>
        {
            //lock (dvResetOverlayShapeRows)
            {
                //Create a temp table per thread that has the same schema as the main table
                return (dvResetOverlayShapeRows.ToTable().Clone());
            }
        },
        (i32RowCount, loopState, localTable) =>
        {
            try
            {
                //lock (localTable)
                {
                    int i32NewID = i32RowCount + 1;
                    localTable.Rows[i32RowCount][_strOverlayShapeID] = i32NewID;
                    //dvResetOverlayShapeRows[i32RowCount][_strOverlayShapeID] = i32NewID;
                    ///*****************************************
                    ///For Image Overlay
                    ///*****************************************
                    //If the data is present in memory, update it in memory; otherwise update it in the cache file.
                    if (localTable.Rows[i32RowCount][_strImageOverlayData] != null && localTable.Rows[i32RowCount][_strImageOverlayData].GetType() != typeof(System.DBNull))
                    //if (conbgdvResetOverlayShapeRows.ElementAt(0)[i32RowCount][_strImageOverlayData] != null && conbgdvResetOverlayShapeRows.ElementAt(0)[i32RowCount][_strImageOverlayData].GetType() != typeof(System.DBNull))
                    {
                        ((IImageShapeBehaviour)(localTable.Rows[i32RowCount][_strImageOverlayData])).UpdateOverlayShapeID(i32NewID);
                    }
                    else
                    {
                        BinaryFormatter clsBinaryFormatter = new BinaryFormatter();
                        IImageShapeBehaviour _clsImageShapeBehaviour = null;
                        //Get the cache file path for the image overlay
                        string strImageOverlayFileName = Convert.ToString(localTable.Rows[i32RowCount][_strImageDataFileName], System.Globalization.CultureInfo.InvariantCulture);
                        string strImageOverlayFilePath = _strImageOverlayCacheFolderPath + @"\" + strImageOverlayFileName;
                        if (File.Exists(strImageOverlayFilePath))
                        {
                            using (FileStream _clsImageOverlayFs = new FileStream(strImageOverlayFilePath, FileMode.Open))
                            {
                                //Get the overlay object from the cache file
                                _clsImageShapeBehaviour = (IImageShapeBehaviour)clsBinaryFormatter.Deserialize(_clsImageOverlayFs);
                                //Update this object with the new overlay ID
                                _clsImageShapeBehaviour.UpdateOverlayShapeID(i32NewID);
                                _clsImageOverlayFs.Close();
                            }
                            //Serialize the updated object back to the file
                            using (FileStream _clsImageOverlayFs = new FileStream(strImageOverlayFilePath, FileMode.Create))
                            {
                                clsBinaryFormatter.Serialize(_clsImageOverlayFs, _clsImageShapeBehaviour);
                                _clsImageOverlayFs.Close();
                            }
                        }
                    }
                    ///*****************************************
                    ///For Profile Or Histogram Overlay
                    ///*****************************************
                }
            }
            catch (Exception ex)
            {
                exceptions.Enqueue(ex);
            }
            return localTable;
        },
        (localTable) =>
        {
            //lock (localTable)
            {
                /* localTable.AcceptChanges();
                //_dtOverlayData.DefaultView.RowFilter = _strMeasurementType + " = '" + SelectedMeasurementType.measurementType.ToString() + "'";
                for (int i = 0; i < dvResetOverlayShapeRows.Count; i++)
                {
                    dvResetOverlayShapeRows.Delete(i);
                }
                //_dtOverlayData.AcceptChanges();
                //Merge in the thread local table to the master table
                _dtOverlayData.Merge(localTable);
                _dtOverlayData.AcceptChanges();*/
                //dvResetOverlayShapeRows = localTable.AsDataView();
                //_dtOverlayData.Dispose();
                //_dtOverlayData.Merge(localTable);
            }
        });
    if (exceptions.Count != 0)
        throw new AggregateException(exceptions);
}
catch (AggregateException ex)
{
    foreach (Exception inner in ex.InnerExceptions)
    {
        //Console.WriteLine(inner.Message);
    }
}
} //end of the enclosing method (header not shown)
But my DataView (dvResetOverlayShapeRows) and the associated DataTable (_dtOverlayData) are not updated properly in the parallel version. In the sequential version they are updated correctly. I also tried locking, different delegates, and different Action constructs, but without success.
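One thing that can be checked in isolation is what ToTable() and Clone() actually return, since my thread-local version builds localTable from both. The sketch below is only an illustration: the sample table, the column names ShapeID and MeasurementType, and the class DataViewCopyCheck are hypothetical stand-ins, not my real data.

```csharp
using System;
using System.Data;

public static class DataViewCopyCheck
{
    // Small stand-in for _dtOverlayData; the column names are hypothetical.
    public static DataTable BuildSampleTable()
    {
        var table = new DataTable("Overlay");
        table.Columns.Add("ShapeID", typeof(int));
        table.Columns.Add("MeasurementType", typeof(string));
        table.Rows.Add(10, "Length");
        table.Rows.Add(20, "Length");
        table.Rows.Add(30, "Angle");
        return table;
    }

    // Row count of the table that a localInit like mine would return.
    public static int RowsAfterToTableClone(DataView view)
    {
        // ToTable() copies the filtered rows into a new, separate table;
        // Clone() then copies only the schema, so no rows remain.
        return view.ToTable().Clone().Rows.Count;
    }

    // Writes into a ToTable() copy and returns what the source table holds afterwards.
    public static int SourceValueAfterCopyWrite(DataView view)
    {
        DataTable copy = view.ToTable();
        copy.Rows[0]["ShapeID"] = 1;                // changes the copy only
        return (int)view.Table.Rows[0]["ShapeID"];  // source row is untouched
    }

    // Writes through the view itself and returns what the source table holds afterwards.
    public static int SourceValueAfterViewWrite(DataView view)
    {
        view[0]["ShapeID"] = 1;                     // DataRowView writes to the underlying row
        return (int)view.Table.Rows[0]["ShapeID"];
    }
}
```

With the sample table and the filter MeasurementType = 'Length', RowsAfterToTableClone returns 0, SourceValueAfterCopyWrite returns 10, and SourceValueAfterViewWrite returns 1. That would be consistent with localTable never feeding back into the DataView or the DataTable, though I am not sure it is the whole story.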
Can anyone tell me where the problem is? Does it have to do with locking (I have tried a concurrent collection instead of locking, still with no success), or with the delegate or action results (because my DataView and DataTable are not getting updated)? It would be great if someone could help me with correct code. I am trying data parallelism. Should I go for task parallelism instead?
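To make the question concrete, here is a minimal sketch of one restructuring I am considering, not a confirmed solution. The DataTable documentation describes the type as safe for multithreaded reads only, so the sketch keeps all DataView writes on one thread and runs only the per-file work in parallel, with no lock. The names OverlayRenumbering, RenumberRows, UpdateFiles, and the updateFile delegate are hypothetical; updateFile stands in for my deserialize, update, and serialize step. It also assumes that each file name appears in at most one row, so that no two iterations touch the same file.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.Threading.Tasks;

public static class OverlayRenumbering
{
    // Phase 1 (sequential): renumber the rows through the view and collect
    // one (file path, new ID) pair per row that has a file name.
    public static List<KeyValuePair<string, int>> RenumberRows(
        DataView view, string idColumn, string fileColumn, string cacheFolder)
    {
        var work = new List<KeyValuePair<string, int>>();
        for (int i = 0; i < view.Count; i++)
        {
            int newId = i + 1;
            view[i][idColumn] = newId;
            // Convert.ToString yields an empty string for DBNull values.
            string fileName = Convert.ToString(view[i][fileColumn]);
            if (!string.IsNullOrEmpty(fileName))
            {
                string path = Path.Combine(cacheFolder, fileName);
                work.Add(new KeyValuePair<string, int>(path, newId));
            }
        }
        return work;
    }

    // Phase 2 (parallel): each iteration works on its own file only.
    // Exceptions thrown by updateFile are collected by Parallel.ForEach
    // and rethrown to the caller as a single AggregateException.
    public static void UpdateFiles(
        IEnumerable<KeyValuePair<string, int>> work,
        Action<string, int> updateFile,
        int maxParallel)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        Parallel.ForEach(work, options, item => updateFile(item.Key, item.Value));
    }
}
```

In my case the in-memory overlay objects would still be updated in phase 1, and only the rows whose data lives in a cache file would reach phase 2. Since Parallel.ForEach already aggregates exceptions, the manual exceptions queue from my attempts would no longer be needed in this shape.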
 
Someone told me that because disk reading and writing is a serial activity at the driver level, multitasking will not help in this case. Is this true? We can start multiple copy operations on the same folder or drive, and they finish a bit faster. That is multitasking with file operations too, right?
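One way to test this rather than guess would be to time the same file work at different degrees of parallelism. The sketch below is a rough probe under assumptions: FileParallelismProbe and TimeRewrite are hypothetical names, and the read-then-write body is a simplified stand-in for my real work, which also spends CPU time on deserialization and serialization.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

public static class FileParallelismProbe
{
    // Reads every file in 'paths' and writes the same bytes back,
    // using at most 'maxParallel' concurrent iterations.
    // Returns the elapsed wall-clock time for the whole batch.
    public static TimeSpan TimeRewrite(string[] paths, int maxParallel)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        var watch = Stopwatch.StartNew();
        Parallel.ForEach(paths, options, path =>
        {
            byte[] data = File.ReadAllBytes(path);
            File.WriteAllBytes(path, data);
        });
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Calling TimeRewrite with maxParallel set to 1, 2, 4, and Environment.ProcessorCount on a copy of the cache folder should show whether the drive benefits from concurrent requests. I would expect the numbers to depend on the type of disk and on operating system caching, so repeated runs are probably needed before drawing a conclusion.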
Thanking you very much in advance.
Thanks,
Jagadeesh