How to convert UInt32 to legible text?
Posted by dtygggbhjnnhjj099695@reddit | learnprogramming | 6 comments
I'm a Dead Rising 3 modder, and the game stores its strings as UInt32 values. This is also how the game stores its animation IDs, which don't actually seem to correlate to anything. (For example, the animation "player_attack_heavymetal_heavy_spin" converted to UInt32 is "2350023456", while its actual animation ID is "950460626".)
This means that if the animation isn't referenced anywhere else in the code besides the animation file itself, I can't use it. So I've been trying to reverse engineer the strings, but I haven't had any luck; I just get 4 illegible characters. Does anyone have a way to help? Is it impossible?
teraflop@reddit
A 32-bit integer can't encode more than 4 ASCII characters. There simply aren't enough bits.
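To make that limit concrete, here is a minimal Python sketch. It assumes plain 8-bit ASCII and big-endian byte order, both of which are assumptions on my part and not something known about the game's format. Reading a UInt32 as raw bytes yields exactly four characters, and for an ID like 950460626 most of them aren't printable, which matches the "4 illegible characters" described in the post.

```python
import struct

def uint32_to_ascii(value: int) -> str:
    """Interpret a 32-bit unsigned integer as 4 raw bytes (big-endian, an assumption)."""
    raw = struct.pack(">I", value)
    # Show non-printable bytes as '.' so the result is always displayable.
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)

def ascii_to_uint32(text: str) -> int:
    """Pack exactly 4 ASCII characters into one 32-bit unsigned integer."""
    raw = text.encode("ascii")
    if len(raw) != 4:
        raise ValueError("a UInt32 holds exactly 4 bytes")
    return struct.unpack(">I", raw)[0]

if __name__ == "__main__":
    packed = ascii_to_uint32("spin")
    print(packed, uint32_to_ascii(packed))   # round-trips a 4-character string
    print(uint32_to_ascii(950460626))        # the animation ID read as raw bytes
```

Anything longer than four characters simply has nowhere to go in this scheme.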
It might be that in your application, there exists some kind of mapping table that associates each ID with a corresponding string. If such a table exists, it's up to you to find it; if it doesn't exist, there's not much you can do about it. This kind of thing is why reverse-engineering is hard!
If whatever "animation file" you're working with contains a list of animations, you could manually go through them and assign your own meaningful names to them.
I have no idea what you mean by "converted to UInt32" here.
dtygggbhjnnhjj099695@reddit (OP)
When packing files using a modding tool designed for the game, the code is automatically compressed, and strings are automatically translated into a UInt32 decimal number.
teraflop@reddit
That sounds like some kind of hash function, probably a very simple one if the hash of a 1-character string is equal to the character's ASCII code.
Unfortunately, a hash function like that is not going to be reversible for strings that are more than 4 characters long, for a very simple reason. The number of, say, 10-character strings is many billions of times larger than the number of possible 32-bit integers. So each integer must correspond, on average, to many billions of possible strings, and without more information, there is no way of knowing which one it originally was. (See the pigeonhole principle for a more formal explanation.)
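The counting argument can be checked with a few lines of Python. This is only arithmetic, not anything specific to the game; the alphabet of 95 printable ASCII characters is my assumption, and a smaller alphabet lowers the numbers without changing the conclusion for long strings.

```python
UINT32_VALUES = 2 ** 32    # number of distinct 32-bit integers
PRINTABLE_ASCII = 95       # printable ASCII characters, space through '~' (assumed alphabet)

def strings_per_value(length: int, alphabet_size: int = PRINTABLE_ASCII) -> float:
    """Average number of strings of this length that must share one 32-bit value."""
    return alphabet_size ** length / UINT32_VALUES

if __name__ == "__main__":
    for n in (4, 5, 10, 35):
        print(f"{n:2d} chars: about {strings_per_value(n):.3g} strings per 32-bit value")
```

Once the ratio goes above 1, at least two strings must collide on some value, and at 10 characters it is already in the billions.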
The best you can hope for is that the original string is stored somewhere else in the data files, in a place you haven't found yet.
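If you do go hunting through the data files, one way to automate it is sketched below: pull printable strings out of a binary file, hash each one, and keep those whose hash matches an ID you care about. Everything here is hypothetical. The helper names are made up, and `placeholder_hash` uses CRC-32 purely as a stand-in; you would have to replace it with whatever the modding tool actually computes, which isn't known here.

```python
import re
import zlib
from typing import Callable, Dict, Iterable, Set

def extract_strings(blob: bytes, min_len: int = 4) -> Iterable[str]:
    """Yield runs of printable ASCII from binary data, like the Unix `strings` tool."""
    for match in re.finditer(rb"[\x20-\x7e]{%d,}" % min_len, blob):
        yield match.group().decode("ascii")

def match_ids(candidates: Iterable[str],
              target_ids: Set[int],
              hash_fn: Callable[[str], int]) -> Dict[int, str]:
    """Map each target ID to the first candidate string that hashes to it."""
    found: Dict[int, str] = {}
    for text in candidates:
        value = hash_fn(text)
        if value in target_ids:
            found.setdefault(value, text)
    return found

def placeholder_hash(text: str) -> int:
    # NOT the game's hash: CRC-32 is only a stand-in so the sketch runs.
    return zlib.crc32(text.encode("ascii")) & 0xFFFFFFFF

if __name__ == "__main__":
    blob = b"\x00\x01player_attack\x00\xffab\x00heavy_spin\x00"
    wanted = {placeholder_hash("heavy_spin")}
    print(match_ids(extract_strings(blob), wanted, placeholder_hash))
```

A match found this way is only a candidate, since other strings can hash to the same value, but a hit on a name that looks like an animation name is a strong hint.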
Able_Mail9167@reddit
To add to this, I have a strong feeling they do this for language support. I'd look to see if they're loading lang files anywhere.
sparant76@reddit
It can encode a lot more if you limit yourself to a useful subset of ASCII characters. If you just want the 26 letters of the alphabet in a single case, you only need 5 bits per character, and you can store 6 of those in 32 bits.
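A minimal sketch of that 5-bit packing in Python, assuming a lowercase-only alphabet and a fixed length of 6 characters (fixing the length is my choice, to avoid ambiguity with leading "a" characters, which encode as zero):

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"   # 26 symbols fit in 5 bits (2**5 = 32)
BITS_PER_CHAR = 5
MAX_CHARS = 32 // BITS_PER_CHAR           # 6 characters, using 30 of the 32 bits

def pack(text: str) -> int:
    """Pack exactly 6 lowercase letters into one integer that fits in 32 bits."""
    if len(text) != MAX_CHARS:
        raise ValueError(f"expected exactly {MAX_CHARS} characters")
    value = 0
    for ch in text:
        # str.index raises ValueError for characters outside the alphabet.
        value = (value << BITS_PER_CHAR) | ALPHABET.index(ch)
    return value

def unpack(value: int) -> str:
    """Reverse pack(); only meaningful for values that pack() produced."""
    chars = []
    for _ in range(MAX_CHARS):
        chars.append(ALPHABET[value & 0b11111])
        value >>= BITS_PER_CHAR
    return "".join(reversed(chars))

if __name__ == "__main__":
    packed = pack("attack")
    print(packed, unpack(packed))
```

This is fully reversible, unlike a hash, but it stops at 6 characters.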
teraflop@reddit
Yeah, if you want to get technical then that buys you a couple more characters. But the number of possible strings grows exponentially with the string length. OP was asking about a 35-character string and there's no way that can possibly fit, unless there's some weird ad-hoc encoding scheme that encodes predefined substrings instead of individual characters.
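For a rough sense of scale, this Python sketch computes the minimum number of bits needed to store any string of a given length exactly. The 27-symbol alphabet (lowercase letters plus underscore) is an assumption based on the one animation name in the post.

```python
import math

def bits_needed(length: int, alphabet_size: int) -> int:
    """Minimum whole bits needed to distinguish every string of this length."""
    return math.ceil(length * math.log2(alphabet_size))

if __name__ == "__main__":
    name = "player_attack_heavymetal_heavy_spin"
    alphabet_size = 27  # assumption: lowercase letters plus underscore
    print(len(name), "characters need", bits_needed(len(name), alphabet_size), "bits")
    print("6 letters need", bits_needed(6, 26), "bits; 7 letters need", bits_needed(7, 26))
```

The 35-character name needs roughly five times the 32 bits available, so no per-character encoding can fit it.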