String encoding format used in The Sims and The Sims 2.

The length of the string is encoded 7 bits at a time, lowest 7 bits first, and stored immediately before the string data. The high bit of each length byte signals whether another length byte follows.

If you develop in .NET, you can read these 7-bit length-prefixed strings directly with the built-in ReadString() method of the BinaryReader class.

To decode the length of a 7-bit-encoded string, use the following logic:

  1. Read the first byte and test the 8th (most significant) bit, 0x80. If it is set, an extra byte is needed.
  2. Mask off the 8th bit (AND with 0x7F) and store the value.
  3. Repeat steps 1 and 2 for the next byte until the 8th bit is not set.
  4. Left-shift the stored value of the second byte by 7, and add it to the stored value of the first byte.
  5. Repeat step 4 for each extra byte, increasing the shift by 7 each time (14 for the third byte, 21 for the fourth, and so on).

The final result will be the proper length for the string. An example is as follows, assuming the first two bytes are:

0xE0 0x03

> (0x00E0 & 0x007F) + (0x0003<<7) = 0x0060 + 0x0180 = 0x01E0

The length of a string with the first two bytes 0xE0 and 0x03 is 0x01E0, or 480 characters. Another example could be:

0xD8 0x0D

> (0x00D8 & 0x007F) + (0x000D<<7) = 0x0058 + 0x0680 = 0x06D8

The length of the string is 0x06D8, or 1752 characters.
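Both worked examples can be checked with a short Java sketch. The class name SevenBitLength and the method decodeLength are made up for this illustration and are not part of any game tool or library. The method takes the length bytes as an array rather than reading from a stream.

```java
class SevenBitLength {
    /**
     * Decodes a 7-bit-encoded length that starts at data[0].
     * Hypothetical helper written for this example only.
     */
    static int decodeLength(byte[] data) {
        int length = 0;
        int shift = 0;
        for (byte b : data) {
            int value = b & 0xFF;               // treat the byte as unsigned
            length |= (value & 0x7F) << shift;  // keep the low 7 bits, shifted into place
            if ((value & 0x80) == 0) {
                return length;                  // 8th bit clear: this was the last length byte
            }
            shift += 7;                         // the next byte supplies the next 7 bits
        }
        throw new IllegalArgumentException("length prefix is truncated");
    }

    public static void main(String[] args) {
        System.out.println(decodeLength(new byte[] {(byte) 0xE0, 0x03})); // 480
        System.out.println(decodeLength(new byte[] {(byte) 0xD8, 0x0D})); // 1752
    }
}
```

A single byte below 0x80 decodes to itself, so short strings need only one length byte.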

Sample code:

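// Assumes readUnsignedByte() returns the next byte of the stream as a value from 0 to 255.
int stringLength = 0;
int nextByte = 0;
int i = 0;
// Each byte with the 8th bit set contributes 7 bits and announces another byte.
while (((nextByte = readUnsignedByte()) & 0x80) != 0) {
    stringLength |= (nextByte & 0x7F) << (7 * i);
    i++;
}
// The final byte has the 8th bit clear and supplies the highest 7 bits.
stringLength |= (nextByte & 0x7F) << (7 * i);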
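The sample stops once the length is known. As a rough sketch of the remaining step, the Java class below reads the length prefix and then the string data from a java.io.DataInputStream. The names SevenBitStringReader, readLength, and readString are hypothetical. Two things here are assumptions rather than facts from this page: the length is treated as a count of bytes, and the character set is passed in by the caller because the encoding of the string data is not stated here.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class SevenBitStringReader {
    /** Reads the 7-bit-encoded length prefix, lowest 7 bits first. */
    static int readLength(DataInputStream in) throws IOException {
        int length = 0;
        int shift = 0;
        int nextByte;
        do {
            nextByte = in.readUnsignedByte();      // 0..255, throws EOFException at end of stream
            length |= (nextByte & 0x7F) << shift;  // add this byte's 7 payload bits
            shift += 7;
        } while ((nextByte & 0x80) != 0);          // 8th bit set: another length byte follows
        return length;
    }

    /**
     * Reads the length prefix, then that many bytes, and decodes them.
     * Assumption: the length counts bytes, and the caller knows the charset.
     */
    static String readString(DataInputStream in, Charset charset) throws IOException {
        int length = readLength(in);
        byte[] payload = new byte[length];
        in.readFully(payload);                     // fails if the stream ends early
        return new String(payload, charset);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0x05, 'H', 'e', 'l', 'l', 'o'};
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        System.out.println(readString(in, StandardCharsets.UTF_8)); // Hello
    }
}
```

Using readFully instead of a plain read means a truncated file surfaces as an exception rather than as a silently shortened string.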

