Run Length Encoding ( RLE ) for Redundant Data

Run Length Encoding ( RLE ) for Redundant Data


Run Length Encoding ( RLE ) can be described in detail on wikipedia -> http://en.wikipedia.org/wiki/Run-length_encoding. Basically, the idea is that you encode data that is highly redundant by nature. One good example of this is a more or less static image - or some portion of a GUI. The encoding scheme takes the data and compresses it by examining redundant data, marking it with a Key ( or 'escape' ), and then giving that specific data a 'count' of how many times it is repeated. So, say that you take a .PNG image and strip the leading header from it so that you only have the raw pixel data -- suppose it is 32-bit RGBA ( Red / Green / Blue / Alpha ). Now suppose that the first 100 pixels are all black and have a 0 alpha value ( so they are all R=0 G=0 B=0 A=0 ).


Data looks like:

0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 <---- Pixel Data

R G B A ... ... R G B A <---- RGBA Values


You can greatly compress this data by marking it with a Key, then placing the pixel value there, and following up with a count of the repetitions.


Suppose we choose our escape 'Key' to be a 32bit value that is never seen in any image we expect:

255 255 0 0 <----- Escape 'Key' , Yellow with 0 Alpha


Compression then looks like:

255 255 0 0 0 0 0 0 100 <---- 9 Bytes to describe the compressed data

[ Key - Yellow ] ... [R G B A] ... [Count]


If you continue this trend , the image will become greatly compressed. To Un-Compress the data, just do the reverse. Look for the key, then figure out the pixel values associated with the key, and then finally repeat that pixel value as many times as your count describes.


Below is a really simple example of RLE encoding and decoding of a PNG image in Qt. First read in the binary data of the PNG file without a header:


// Write the stored data to file

QFile file("/home/user/Desktop/binary.bin");

file.open(QIODevice::ReadOnly);

QByteArray byteArray;

byteArray = file.readAll();

file.close();


Now create a RLE Encoder class with methods 'encode' and 'decode':

QByteArray RLE_encoder::encode( QByteArray imageData ) {


qDebug() << "\n\nRead file is: " << imageData.length() << " bytes...";


// Save it to an image file so we can compare with our decoded stream

QImage receivedImage = QImage( (uchar *)imageData.data(), 640, 480, QImage::Format_ARGB32 );

receivedImage.save(QString("/home/pclass/Desktop/A.png"));


// Get the raw data in the bytearray

char *binaryData = imageData.data();


// Define a QByteArray to contain our compressed data

char compressedImageData[imageData.length()];


// Define an escape key

uchar escapeKey[] = {255,255,0,0}; // RGBA


// Keep track of the current sequence

uchar currentSequence[] = {255,255,0,0}; // Init to Yellow with No Alpha...


// Loop through the data 4 bytes at a time

int compressedCounter = 0; // Keep track of bytes being inserted into compressed array

int sequenceCounter = 0; // Count the number of times a sequence occurs

for ( int i = 0 ; i < imageData.length() ; i += 4 ) {


// For the very first iteration, grab the RGB values to compare against

if ( i == 0 ) {


// Now grab the new sequence

currentSequence[0] = (uchar)binaryData[i];

currentSequence[1] = (uchar)binaryData[i+1];

currentSequence[2] = (uchar)binaryData[i+2];

currentSequence[3] = (uchar)binaryData[i+3];


}


// Extract RGB components

uchar r = (uchar)binaryData[i];

uchar g = (uchar)binaryData[i+1];

uchar b = (uchar)binaryData[i+2];

uchar a = (uchar)binaryData[i+3];


// Check the current sequence against the last one

if ( currentSequence[0] == r && currentSequence[1] == g && currentSequence[2] == b && currentSequence[3] == a ) {


// Increment our counter of repeated pixels

sequenceCounter++;


}

// If the new set of RGBA and old set do not match or we reached the end of our dataset, encode that data

if ( (currentSequence[0] != r || currentSequence[1] != g || currentSequence[2] != b || currentSequence[3] != a) || (i >= imageData.length()-4)) {


// At this point we need the previous RGBA values since we compared against the most recent set

uchar previousR = currentSequence[0];

uchar previousG = currentSequence[1];

uchar previousB = currentSequence[2];

uchar previousA = currentSequence[3];


// Handle where the count is higher than 255

if ( sequenceCounter > 255 ) {


// Split value counter over 255 into multiple entries in our byte array

int remainder = sequenceCounter;

while ( remainder > 255 ) {


// Using the data we have now, insert a new RLE entry

compressedCounter = create_rle_entry( compressedImageData , 255 , compressedCounter, previousR, previousG, previousB, previousA, escapeKey );


// Figure out the remainder

remainder = remainder - 255;


}


// Grab the last [remainder] repeated sequences if there are any

if ( remainder > 0 ) {


// Using the data we have now, insert a new RLE entry

compressedCounter = create_rle_entry( compressedImageData , remainder , compressedCounter, previousR, previousG, previousB, previousA, escapeKey );


}


}

else {


// Using the data we have now, insert a new RLE entry

compressedCounter = create_rle_entry( compressedImageData , sequenceCounter , compressedCounter, previousR, previousG, previousB, previousA, escapeKey );


}


// Reset the counter

sequenceCounter = 1;


}


// Now grab the new sequence

currentSequence[0] = r;

currentSequence[1] = g;

currentSequence[2] = b;

currentSequence[3] = a;


}


qDebug() << "Done encoding... " << compressedCounter << "\n\n";


for ( int a = compressedCounter-18 ; a < compressedCounter ; a++ ) {

//qDebug() << "BAD DATA: " << (uchar)compressedImageData[a];

}


// Return the compressed QByteArray

return QByteArray::fromRawData( compressedImageData , compressedCounter );


}


int RLE_encoder::create_rle_entry( char *compressedImageData , int sequenceCounter , int indexForData, uchar r, uchar g, uchar b, uchar a, uchar *escapeKey ) {


// Put in the Escape ( 0 Alpha and Yellow ) for the next one

compressedImageData[indexForData++] = escapeKey[0];

compressedImageData[indexForData++] = escapeKey[1];

compressedImageData[indexForData++] = escapeKey[2];

compressedImageData[indexForData++] = escapeKey[3];


// Put in the Value ( RGBA )

compressedImageData[indexForData++] = r;

compressedImageData[indexForData++] = g;

compressedImageData[indexForData++] = b;

compressedImageData[indexForData++] = a;


// Now stuff the counted number of repeated sequences in the tracker

compressedImageData[indexForData++] = sequenceCounter;


// Return the new index

return indexForData;


}

Decoding is straightforward as well:

QByteArray rle_encoder::decode( QByteArray compressed , int uncompressedStreamSize ) {


qDebug() << "Decompressing RLE stream...";

qDebug() << "Compressed data length: " << compressed.length();

qDebug() << "Uncompressed Stream Size: " << uncompressedStreamSize;


// Get the data from the QByteArray

const char* compressData = compressed.data();


// Define an escape key

uchar escapeKey[] = {255,255,0,0}; // Init to Yellow with No Alpha...


// Define the expanded stream

char decompressedData[uncompressedStreamSize];


// Loop through to decompress the stream

int i = 0; // track our loop iterations

int j = 0; // track our decompressed stream index

int p = 0;

while ( i <= compressed.length() ) {


// Extract four components to look for the escape key

uchar k0 = (uchar)compressData[i++];

uchar k1 = (uchar)compressData[i++];

uchar k2 = (uchar)compressData[i++];

uchar k3 = (uchar)compressData[i++];


// Compare to look for an escape key

if ( k0 == escapeKey[0] && k1 == escapeKey[1] && k2 == escapeKey[2] && k3 == escapeKey[3] ) {


// Since we found an escape key , extract the RGBA values

uchar r = (uchar)compressData[i++];

uchar g = (uchar)compressData[i++];

uchar b = (uchar)compressData[i++];

uchar a = (uchar)compressData[i++];


// Next grab the count of these values

uchar rleCount = (uchar)compressData[i++];

qDebug() << "RLECOUNT: " << rleCount;

p+=rleCount;


// Expand our count and values into the decompressed

for ( int k = 0 ; k < rleCount ; k++ ) {


// Expand our data into the decompressed stream

decompressedData[j++] = (uchar)r;

decompressedData[j++] = (uchar)g;

decompressedData[j++] = (uchar)b;

decompressedData[j++] = (uchar)a;


}

}

}


qDebug() << "P IS " << p;

qDebug() << "P*4 IS " << p*4;


// ByteArray data to return

QByteArray dataToReturn = QByteArray::fromRawData( decompressedData , uncompressedStreamSize );


// Return the QByteArray

return dataToReturn;


}

Now you can open an image, encode it with RLE, decode it, save it as a new image, and the compare the original and new images. Happy Coding :)




ClassyBits 2016