The byte order fallacy (part 2)

Rob Pike's excellent criticisms of typical byte ordering and re-ordering conventions led me to implement some of his cleaner and more efficient methods in my own code. I went back and cut out all the redundant, ugly byte swapping that relied on arrays, helper files and GNU extensions, which was cluttering the code and potentially leading to compatibility issues. The code I removed included some very useful lines from Gabriel Staples that used pointers to swap the bytes in place.

My project involved reading bytes from a specific file format that happened to be big endian. Most of the data were 16 bit unsigned integers, which were very quick and easy to load into memory robustly using Rob's method. Unfortunately the metadata included double precision floating point numbers, where we run into a few immediate problems trying to implement Rob's method. Attempting to compile with gcc I found:

  1. You can't do bitwise operations on floats
  2. You can't shift the most significant bits far enough
  3. You can't directly cast binary values

Thankfully these issues are reasonably easy to overcome. We just need to read the bytes into an array and cast each byte to an unsigned 64 bit integer (uint64_t). This gives us enough space to shift the bits the necessary amount and glue them all together to make one 64 bit integer that now contains all the bytes in the correct order.
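To illustrate why the cast has to happen before the shift, here is a minimal sketch. The helper name byte_at is hypothetical and not part of the original code; it assumes position is between 0 and 7.

```c
#include <stdint.h>

/* Hypothetical helper: place one byte at a given byte position
   (0 = least significant, 7 = most significant) in a 64 bit word. */
uint64_t byte_at(uint8_t byte, unsigned position)
{
    /* Without the cast, `byte` would be promoted to int (usually
       32 bits wide), and shifting an int by 32 bits or more is
       undefined behaviour. Widening first gives the shift room. */
    return (uint64_t)byte << (8 * position);
}
```

Calling byte_at(0x3F, 7) moves the byte into the top eight bits of the result, which is the kind of shift the plain uint8_t could not survive.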

First let's create an array of bytes called CurrentDouble that is just big enough to store all the bytes in a double, and a 64 bit unsigned integer (uint64_t) big enough to store all the same bytes but in just one variable. Both fixed width types come from stdint.h:

uint8_t CurrentDouble[sizeof(double)];

uint64_t CurrentDoubleAsLongLong;

Now we can read one byte at a time into that array "size of double" times (i.e. 8 times) from the input stream "inptr":

fread(CurrentDouble, sizeof(uint8_t), sizeof(double), inptr);

Then shift the bytes and pack them all together into the uint64_t variable:

CurrentDoubleAsLongLong = (uint64_t)CurrentDouble[7] <<  0
                        | (uint64_t)CurrentDouble[6] <<  8
                        | (uint64_t)CurrentDouble[5] << 16
                        | (uint64_t)CurrentDouble[4] << 24
                        | (uint64_t)CurrentDouble[3] << 32
                        | (uint64_t)CurrentDouble[2] << 40
                        | (uint64_t)CurrentDouble[1] << 48
                        | (uint64_t)CurrentDouble[0] << 56;
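The same packing can also be written as a loop, which some readers may find easier to check. This is only a sketch of an equivalent form; the function name pack_big_endian_u64 is my own invention for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: combine 8 bytes stored most significant
   byte first (big endian) into one 64 bit unsigned integer. */
uint64_t pack_big_endian_u64(const uint8_t bytes[8])
{
    uint64_t value = 0;
    for (size_t i = 0; i < 8; i++) {
        /* Make room for the next byte, then drop it into the
           lowest 8 bits. The first byte read ends up highest. */
        value = (value << 8) | bytes[i];
    }
    return value;
}
```

Because the accumulator is already a uint64_t, every shift happens at 64 bit width and no per-byte cast is needed. The result does not depend on the byte order of the machine running the code, which is the point of Rob's approach.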

Perfect, we're finished, that was easy: all we have to do now is cast the uint64_t back to a double, since they should both be the same number of bytes. Oops, wait, hang on, no, that didn't work. What's going on here? Well, when you cast an int to a float, you're telling the computer to turn a binary representation of a specific integer into a binary representation of a float with the same value as the integer. But we never had an integer. We had a float that we pretended was an integer for convenience, and now the compiler interprets it as some huge integer and converts it to a new float of the same (huge) value, which is not quite what we wanted.
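A small sketch makes the problem concrete. The helper name convert_value is hypothetical. The bit pattern 0x3FF0000000000000 is how an IEEE 754 double stores 1.0, but a cast treats those bits as the integer 4607182418800017408 and converts that value instead.

```c
#include <stdint.h>

/* Hypothetical helper: an ordinary cast. This converts the VALUE of
   the integer to the nearest double; it does not reuse the bits. */
double convert_value(uint64_t bits)
{
    return (double)bits;
}
```

So convert_value(0x3FF0000000000000) gives a number around 4.6e18 rather than the 1.0 that was stored in the file.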

Since casting changes the binary values we read from the file, we have to find a way of tricking the computer into thinking that the uint64_t is a float without actually casting it. To do this we can create a pointer, and tell the compiler that this pointer is pointing to a double:

double * temporary_pointer;

Then set this pointer to point at the address of the integer variable holding our float data:

temporary_pointer = &CurrentDoubleAsLongLong;

This will work, but gcc will throw a warning about incompatible types, which can be avoided by explicitly casting the address to a double*:

temporary_pointer = (double *)&CurrentDoubleAsLongLong;

Now we have a pointer to an integer holding our binary values, and we have said to the compiler "hey, this pointer is pointing to a double, I swear" so when we dereference it, it will be correctly interpreted as such.

double MyDouble = *temporary_pointer;

Now we have a double called "MyDouble" that contains whatever binary data was stored in "CurrentDoubleAsLongLong" exactly as we assigned it above. This is deliberately verbose for the sake of explanation. All this pointer magic can be tidied into just one simple line that casts and dereferences the address in one go:

double MyDouble = *(double *)&CurrentDoubleAsLongLong;

printf("The value of MyDouble is %lf\n", MyDouble);
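One caveat worth knowing: reading a uint64_t through a double pointer is not guaranteed by the C standard (it breaks the "strict aliasing" rules), and gcc can warn about it at higher optimisation levels. A commonly used alternative is to copy the bytes with memcpy. The sketch below is that different technique, not the pointer method described above. The names bits_to_double and read_big_endian_double are hypothetical, and it assumes double is a 64 bit IEEE 754 type stored in the same byte order as integers on the host, which holds on most mainstream platforms.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reinterpret the 64 bits of an integer as a
   double by copying the bytes, with no numeric conversion.
   Assumes sizeof(double) == sizeof(uint64_t). */
double bits_to_double(uint64_t bits)
{
    double value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Hypothetical helper: read one big endian double from a stream.
   Returns 1 on success, 0 if fewer than 8 bytes could be read. */
int read_big_endian_double(FILE *inptr, double *out)
{
    uint8_t raw[8];
    uint64_t bits = 0;

    if (fread(raw, sizeof raw[0], sizeof raw, inptr) != sizeof raw) {
        return 0; /* short read or I/O error */
    }
    for (size_t i = 0; i < sizeof raw; i++) {
        bits = (bits << 8) | raw[i]; /* first byte ends up highest */
    }
    *out = bits_to_double(bits);
    return 1;
}
```

Compilers generally turn a fixed size memcpy like this into a plain register move, so it should cost no more than the pointer cast. Checking the fread return value also catches a truncated file before the bytes are interpreted.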