Difference between revisions of "7BITSTR"
m (It's essentially clean)
(Copied in actual 7bitstr parsing logic from old DarkMatter phorum thread)
Revision as of 04:18, 11 October 2009
7BITSTR is the string encoding format used in The Sims and The Sims 2.
Each string is preceded by its length, which is encoded 7 bits per byte, lowest 7 bits first.
If you develop in .NET, you can read these 7-bit length-prefixed strings directly with the built-in ReadString() method of the BinaryReader class.
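To make the byte layout concrete, here is a minimal Java sketch of the opposite direction: turning a length into its 7-bit-coded prefix. The class name `SevenBitLengthEncoder` and the method `encodeLength` are hypothetical names chosen for this illustration; they do not come from any Sims tool or library.

```java
import java.io.ByteArrayOutputStream;

final class SevenBitLengthEncoder {
    /** Encodes a non-negative length as a 7-bit-coded prefix, lowest 7 bits first. */
    static byte[] encodeLength(int length) {
        if (length < 0) {
            throw new IllegalArgumentException("length must be non-negative");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int remaining = length;
        while (remaining >= 0x80) {
            // More than 7 bits left: emit the low 7 bits with the 8th bit set.
            out.write((remaining & 0x7F) | 0x80);
            remaining >>>= 7;
        }
        // Final byte: 8th bit clear, so a decoder knows to stop here.
        out.write(remaining);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : encodeLength(480)) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        System.out.println(hex.toString().trim()); // prints E0 03
    }
}
```

A length of 480 comes out as the two bytes 0xE0 0x03, the same pair used in the first decoding example on this page.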
To decode the length of a 7-bit-coded string, use the following logic:
1. Read the first byte and test its 8th (most significant) bit, 0x80. If it is set, an extra byte is needed.
2. Mask off the 8th bit (AND with 0x7F) and store the value.
3. Repeat steps 1 and 2 for the next byte until the 8th bit is not set.
4. Left-shift the stored value of the second byte by 7 and add it to the stored value of the first byte.
5. Repeat step 4 for each extra byte, increasing the shift by 7 each time.
The final result is the length of the string. For example, assume the first two bytes are:
0xE0 0x03
> (0x00E0 & 0x007F) + (0x0003 << 7)
= 0x0060 + 0x0180
= 0x01E0
The length of a string with the first two bytes 0xE0 and 0x03 is 0x01E0, or 480 characters. Another example could be:
0xD8 0x0D
> (0x00D8 & 0x007F) + (0x000D << 7)
= 0x0058 + 0x0680
= 0x06D8
The length of the string is 1752 characters.
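Both worked examples can be checked mechanically. The following is a minimal Java sketch that decodes a length prefix from a byte array; the class name `SevenBitLength` and the method `decodeLength` are made up for this illustration.

```java
final class SevenBitLength {
    /** Decodes a 7-bit-coded length from the start of the given bytes. */
    static int decodeLength(byte[] data) {
        int length = 0;
        int shift = 0;
        for (byte b : data) {
            int value = b & 0xFF;               // treat the byte as unsigned
            length |= (value & 0x7F) << shift;  // low 7 bits carry the data
            if ((value & 0x80) == 0) {
                return length;                  // 8th bit clear: last length byte
            }
            shift += 7;                         // next byte holds the next 7 bits
        }
        throw new IllegalArgumentException("length prefix is not terminated");
    }

    public static void main(String[] args) {
        System.out.println(decodeLength(new byte[] {(byte) 0xE0, 0x03})); // prints 480
        System.out.println(decodeLength(new byte[] {(byte) 0xD8, 0x0D})); // prints 1752
    }
}
```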
Sample code:
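int stringLength = 0;  // decoded length, built up 7 bits at a time
int shift = 0;         // bit position of the next 7-bit group
int nextByte;
do {
    nextByte = readUnsignedByte();               // next byte of the length prefix, 0..255
    stringLength |= (nextByte & 0x7F) << shift;  // the low 7 bits carry the data
    shift += 7;
} while ((nextByte & 0x80) != 0);                // a set 8th bit means another byte follows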
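The sample code stops once the length is known. As a rough sketch of the remaining step, the Java below reads the prefix and then the string data from a `ByteBuffer`. It assumes that the length counts bytes and that the text is UTF-8, which is how .NET's BinaryReader behaves by default; whether every Sims file follows this is an assumption. The class name `SevenBitStringReader` and the method `readString` are hypothetical, and the 5-byte cap on the prefix is a safety check added here, not part of the format description.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class SevenBitStringReader {
    /** Reads one length-prefixed string starting at the buffer's current position. */
    static String readString(ByteBuffer in) {
        int length = 0;
        int shift = 0;
        int nextByte;
        do {
            if (shift > 28) {
                // A 32-bit length never needs more than 5 prefix bytes.
                throw new IllegalArgumentException("length prefix longer than 5 bytes");
            }
            nextByte = in.get() & 0xFF;             // unsigned byte
            length |= (nextByte & 0x7F) << shift;   // low 7 bits carry the data
            shift += 7;
        } while ((nextByte & 0x80) != 0);           // 8th bit set: keep reading
        if (length < 0) {
            throw new IllegalArgumentException("length prefix overflows a signed int");
        }
        byte[] payload = new byte[length];
        in.get(payload);                            // assumption: length is a byte count
        return new String(payload, StandardCharsets.UTF_8); // assumption: UTF-8 text
    }

    public static void main(String[] args) {
        byte[] data = {0x05, 'H', 'e', 'l', 'l', 'o'};
        System.out.println(readString(ByteBuffer.wrap(data))); // prints Hello
    }
}
```

If the buffer ends before the prefix or the payload is complete, `ByteBuffer.get` throws `BufferUnderflowException`, so truncated input fails loudly rather than yielding a short string.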