Why does little endian apply to numbers and not to text strings?
Problem
The Portable Executable file format is the format that Windows EXE files use.
It is a binary format.
Numbers are in little endian form. Thus, the following hex represents the decimal number 256, not 1.
00 01
Some fields in the file format represent text strings. For example, the "Name" field contains a null-terminated string that has up to 8 characters. Here are the hex bytes for a name and in parentheses the corresponding string:
2E 64 61 74 61 00 00 00 (.data)
My question is this: The file's byte order is little endian and therefore the bytes in a number are interpreted from right-to-left. Why aren't the bytes in a string interpreted right-to-left? Why aren't the bytes for the above string this:
00 00 00 61 74 61 64 2E
Notice that I reversed the order of the bytes. That's not how it is in EXE files, but why not? Why does "little endian" apply only to numbers and not to text strings?
Solution
The premise is wrong: byte order can apply to text as well. Unicode encodings include UTF-16BE, UTF-16LE, UTF-32BE and UTF-32LE. Only UTF-8 has no little-endian or big-endian variants.
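As a rough illustration of the encodings named above, the following Python sketch encodes the section name from the question in UTF-8, UTF-16LE and UTF-16BE. The variable names are chosen here for illustration only; the codec names are the standard Python ones.

```python
# Encode the same five-character string three ways and compare the bytes.
text = ".data"

utf8 = text.encode("utf-8")         # one byte per character for this string
utf16le = text.encode("utf-16-le")  # two-byte units, low byte first
utf16be = text.encode("utf-16-be")  # two-byte units, high byte first

print(utf8.hex(" "))     # 2e 64 61 74 61
print(utf16le.hex(" "))  # 2e 00 64 00 61 00 74 00 61 00
print(utf16be.hex(" "))  # 00 2e 00 64 00 61 00 74 00 61
```

In both UTF-16 variants the characters stay in reading order. Only the two bytes inside each 16-bit unit swap places.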
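Fundamentally, endianness is about the byte order within a multi-byte word, and you're thinking of text encodings which use a single byte per character (e.g. ASCII). There's no order to a single byte. A string in such an encoding is a sequence of one-byte elements, and endianness never reorders the elements of a sequence, only the bytes inside each element.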
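A minimal sketch of that distinction, using the exact bytes from the question. The variable names are illustrative, and treating the name as ASCII padded with NUL bytes is an assumption based on the question's description of the field.

```python
import struct

# A two-byte number: the byte order decides which byte is most significant.
raw_number = bytes.fromhex("00 01")
little = int.from_bytes(raw_number, byteorder="little")  # 0x0100
big = int.from_bytes(raw_number, byteorder="big")        # 0x0001
print(little, big)  # 256 1

# An 8-byte name: eight separate one-byte elements, read in file order.
raw_name = bytes.fromhex("2E 64 61 74 61 00 00 00")
name = raw_name.rstrip(b"\x00").decode("ascii")
print(name)  # .data

# struct's byte-order prefix has no effect on a byte-string field.
as_little = struct.unpack("<8s", raw_name)[0]
as_big = struct.unpack(">8s", raw_name)[0]
print(as_little == as_big)  # True
```

The number changes value with the byte order because its two bytes form one word. The name does not, because each of its bytes is an element on its own.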
Context
StackExchange Computer Science Q#103168, answer score: 6