Introduction
There are many compression techniques. I have a simple compression logic that stores text data in an image and reverses the process to get the text back. With it we can compress text data to as much as 50 times smaller than its original size (for example, 50 MB to 1 MB) and decompress it without any data loss.
Concept behind the approach
Here we convert the text data into the pixels of an image. A pixel normally contains four values, ARGB (alpha, red, green, blue), and each value ranges from 0 to 255. The ASCII values of the English letters lie between 65 and 122, so any letter fits into a single color value. We store one letter in each of the R, G, and B values and keep A fixed at 255, so each pixel holds 3 letters. The image is then saved as a PNG, whose built-in lossless compression is what reduces the size.
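As a minimal sketch of this idea, three letters can be packed into one opaque color and read back again. The class and method names below are illustrative and are not part of the article's code; the sketch assumes every character code fits into one channel (0 to 255).

```csharp
using System;
using System.Drawing;

public static class PixelPacking
{
    // Hypothetical helper: store three character codes in the R, G and B channels.
    // Alpha is fixed at 255 (fully opaque), so it carries no text.
    public static Color Pack(char first, char second, char third)
    {
        return Color.FromArgb(255, first, second, third);
    }

    // Hypothetical helper: read the three letters back from a pixel color.
    public static string Unpack(Color pixel)
    {
        return new string(new[] { (char)pixel.R, (char)pixel.G, (char)pixel.B });
    }
}
```

For example, `PixelPacking.Pack('C', 'a', 't')` gives a color with R = 67, G = 97, B = 116, and `Unpack` turns that color back into "Cat".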
C# code behind the approach
Compression
// Requires: using System; using System.Collections.Generic; using System.Drawing; using System.Drawing.Imaging; using System.IO;
private void doCompress()
{
    var letters = new List<int>(); // Character codes of the whole file, in order
    using (StreamReader sr = File.OpenText("compress.txt")) // Get text from a .txt file to compress
    {
        string line; // To hold a line of the file
        while ((line = sr.ReadLine()) != null) // Read lines one by one from the stream reader
        {
            foreach (char ch in line) // Get each character of the line
            {
                letters.Add(Convert.ToInt16(ch)); // Character code; must be in 1..254 to fit one color value
            }
            letters.Add(255); // Use 255 as a flag that indicates a new line
        }
    }
    var pixels = (letters.Count + 2) / 3; // Each pixel holds 3 letters; round up so no letter is dropped
    var size = Math.Max(1, (int)Math.Ceiling(Math.Sqrt(pixels))); // Side of the smallest square image that fits
    using (var bmp = new Bitmap(size, size)) // Bitmap with the size we calculated
    {
        var count = 0; // Index of the next letter to store
        for (int row = 0; row < size && count < letters.Count; row++)
        {
            for (int column = 0; column < size && count < letters.Count; column++)
            {
                var R = letters[count++]; // One letter in the Red value
                var G = count < letters.Count ? letters[count++] : 0; // One letter in the Green value (0 = padding)
                var B = count < letters.Count ? letters[count++] : 0; // One letter in the Blue value (0 = padding)
                bmp.SetPixel(row, column, Color.FromArgb(255, R, G, B)); // Set the pixel from the three values
            }
        }
        bmp.Save("compressed.png", ImageFormat.Png); // Save the Bitmap; this is the compressed image data
    }
}
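The image size follows from the number of letters. A minimal sketch of one way to compute the side of the smallest square image that fits, assuming three letters per pixel (the helper name `SideFor` is hypothetical):

```csharp
using System;

public static class ImageSizing
{
    // Hypothetical helper: side length of the smallest square image
    // that can hold `letterCount` values at three values per pixel.
    public static int SideFor(int letterCount)
    {
        int pixels = (letterCount + 2) / 3;              // round up: a partly filled pixel still needs a pixel
        int side = (int)Math.Ceiling(Math.Sqrt(pixels)); // smallest side with side * side >= pixels
        return Math.Max(side, 1);                        // a bitmap needs at least one pixel
    }
}
```

For 30 values this needs 10 pixels, so `SideFor(30)` is 4 (a 4 x 4 image); 27 values need exactly 9 pixels, so `SideFor(27)` is 3.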
Decompression
// Requires: using System.Collections.Generic; using System.Drawing; using System.IO;
private void doDeCompress()
{
    var lettersExtract = new List<int>(); // To hold the extracted letters
    using (var bmp = new Bitmap("compressed.png")) // Get the image data to decompress
    {
        for (int row = 0; row < bmp.Width; row++) // Same pixel order as in compression
        {
            for (int column = 0; column < bmp.Height; column++)
            {
                var cr = bmp.GetPixel(row, column); // Get the pixel of the current row and column
                lettersExtract.Add(cr.R); // Add the Red value to the lettersExtract collection
                lettersExtract.Add(cr.G); // Add the Green value to the lettersExtract collection
                lettersExtract.Add(cr.B); // Add the Blue value to the lettersExtract collection
            }
        }
    }
    using (var file = new StreamWriter("Decompressed.txt")) // Open a text file to write the extracted text
    {
        foreach (int write in lettersExtract) // Take the color values one by one
        {
            if (write == 0) continue; // 0 is never a letter; it comes from padding or an unused pixel
            if (write == 255) // 255 is the flag for a new line
                file.WriteLine();
            else
                file.Write((char)write); // Convert the color value back into a character
        }
    }
}
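The "no data loss" property can be checked without touching any image file. The sketch below works on the flat list of color values only; the class name, the use of 0 as padding, and the rejection of characters outside 1..254 are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class RoundTripCheck
{
    // Turn text into color values: one value per character, 255 after each line.
    public static List<int> Encode(string text)
    {
        var values = new List<int>();
        foreach (string line in text.Split('\n'))
        {
            foreach (char ch in line)
            {
                // 0 is reserved for padding and 255 for the new-line flag.
                if (ch == 0 || ch >= 255)
                    throw new ArgumentException("Character does not fit the scheme: " + (int)ch);
                values.Add(ch);
            }
            values.Add(255); // new-line flag
        }
        while (values.Count % 3 != 0) values.Add(0); // pad the last pixel with zeros
        return values;
    }

    // Reverse of Encode: skip padding zeros, turn 255 back into a line break.
    public static string Decode(IEnumerable<int> values)
    {
        var text = new StringBuilder();
        foreach (int value in values)
        {
            if (value == 0) continue;
            if (value == 255) text.Append('\n');
            else text.Append((char)value);
        }
        return text.ToString();
    }
}
```

Because the flag is written after every line, the decoded text always ends with a line break: `Decode(Encode("ab\ncd"))` returns "ab\ncd\n".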
Representation of the approach
Consider the input "Tamil is a classical language". The text is stored 3 letters per pixel, so the first pixel holds "Tam" in its RGB values:
ASCII of 'T' is 84
ASCII of 'a' is 97
ASCII of 'm' is 109
So the color of the first pixel, as (A, R, G, B), is (255, 84, 97, 109).
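The same breakdown can be produced for every pixel of the example. This is a minimal sketch: the helper name `ToPixels` is hypothetical, and padding the last pixel with 0 is an assumption of the sketch.

```csharp
using System;
using System.Collections.Generic;

public static class PixelPreview
{
    // Hypothetical helper: split one line of text into (R, G, B) triples,
    // appending the 255 new-line flag and padding the last pixel with 0.
    public static List<int[]> ToPixels(string line)
    {
        var values = new List<int>();
        foreach (char ch in line) values.Add(ch); // character code of each letter
        values.Add(255);                          // new-line flag
        while (values.Count % 3 != 0) values.Add(0);

        var pixels = new List<int[]>();
        for (int i = 0; i < values.Count; i += 3)
            pixels.Add(new[] { values[i], values[i + 1], values[i + 2] });
        return pixels;
    }
}
```

The sentence has 29 characters, so with the new-line flag it fills exactly 10 pixels. `ToPixels("Tamil is a classical language")[0]` is { 84, 97, 109 }, and the last pixel is { 103, 101, 255 } for 'g', 'e', and the flag.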
Conclusion
With this logic we can compress text data to as much as 50 times smaller than its original size (for example, 50 MB to 1 MB) and decompress it without any data loss. Both compression and decompression are done in a simple way.
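The ratio can be measured on your own data by comparing the two file sizes. A minimal sketch, assuming both files already exist on disk (the class and method names are hypothetical):

```csharp
using System;
using System.IO;

public static class CompressionRatio
{
    // Hypothetical helper: how many times smaller the compressed file is.
    public static double Measure(string originalPath, string compressedPath)
    {
        long original = new FileInfo(originalPath).Length;     // size in bytes
        long compressed = new FileInfo(compressedPath).Length; // size in bytes
        if (compressed == 0)
            throw new InvalidOperationException("Compressed file is empty.");
        return (double)original / compressed;
    }
}
```

With the file names used in this article, the call would be `CompressionRatio.Measure("compress.txt", "compressed.png")`; a result of 50 corresponds to the 50 MB to 1 MB example.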