Here the R1-Lite-Preview from DeepSeek AI showed its power... WTF!! This is amazing!!
Posted by Inspireyd@reddit | LocalLLaMA | 18 comments
Eralyon@reddit
I am impressed.
ihaag@reddit
It took two shots to answer how many rrrrrrr's are in strawberrrrrrrry, but so did Claude's latest model: two shots, after asking it "are you sure?". I cannot wait for the open weights.
YearZero@reddit
If tokenizers were updated to use single characters, then even a 1B model would answer this correctly. It's not an intelligence issue; it's that tokens are the smallest units the model can see. In the future, with more processing power, maybe models will tokenize each character individually, but for now this is just not a good test of a model's intelligence. It's like me asking you how many atoms are on your left finger. You can't see them, so how could you know? Does it make you dumb if you don't give the correct answer?
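To illustrate the point, here is a rough sketch using OpenAI's tiktoken library (cl100k_base is just one representative vocabulary I picked for the example; exact splits vary between tokenizers):

# Show how a word reaches the model as a few multi-character tokens, not letters
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode("strawberry")
print([enc.decode([t]) for t in tokens])  # a handful of multi-letter chunks
print("strawberry".count("r"))            # trivial once characters are visible: 3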
EDLLT@reddit
lmfao, that's a good question.
"How many atoms are in this speck of dust"
YearZero@reddit
Human: "how am I supposed to fucking know?!"
Alien: "ahh there's no intelligent life on this planet, let's move on fellas"
Koksny@reddit
That's just base64 decoding with extra steps; is it really something an average model can't do?
EstarriolOfTheEast@reddit
It's impressive. The extra steps matter. No LLM can decode base32, despite some being champs at base64 for example. Open models also tend to be quite bad at decoding ciphers.
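If anyone wants to reproduce the comparison, both encodings are one call away in Python's standard library (the message here is just a placeholder):

# Generate matching base64 and base32 test strings for the same message
import base64

msg = b"the quick brown fox"
print(base64.b64encode(msg).decode())  # base64 version
print(base64.b32encode(msg).decode())  # base32 version of the same bytes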
LetterFair6479@reddit
Yeah, I don't understand this at all. I tried it and it is so weird. How can it do b64 flawlessly but not b32? Not even after giving it the alphabet in UTF-8 and b32-encoded. What is the reason for this?!
martinerous@reddit
Because it's still statistics-based training and not "real reasoning". b64 is by far more popular, so it has much more impact on the training data.
LetterFair6479@reddit
OK, I stand corrected. ChatGPT-4o actually does it right. It looks like it does some clever function calling.
So, debunked: ChatGPT-4o can do base32 decoding and encoding!
martinerous@reddit
Somehow I doubt it has function calls for different decoders. It might just be the amount of training data, which is much larger for ChatGPT-4o than for DeepSeek Lite, so it includes more encoding examples.
Can 4o also do a XOR cipher? :)
LetterFair6479@reddit
It started with giving this Python code:

# Input string and key
text = "hello world"
key = "K"  # XOR key (can be any character or series of characters)

# XOR encryption
encrypted = ''.join(chr(ord(c) ^ ord(key)) for c in text)

# Print encrypted result (in a readable way, e.g., hexadecimal)
encrypted_hex = encrypted.encode('utf-8').hex()
print("Encrypted (hex):", encrypted_hex)

# To decrypt:
decrypted = ''.join(chr(ord(c) ^ ord(key)) for c in encrypted)
print("Decrypted:", decrypted)
Follow-up question: "Give me the result"
The XOR-encoded result of "hello world" using the key K is:
Hexadecimal representation: 232e2727246b3c2439272f
Decrypted back to original: hello world
Let me know if you need help with anything else!
Not sure if that is correct.
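A quick way to check it is to recompute the XOR hex with the same inputs:

# Recompute the XOR of "hello world" with key "K" and compare to the model's hex
text, key = "hello world", "K"
encrypted_hex = "".join(f"{ord(c) ^ ord(key):02x}" for c in text)
print(encrypted_hex)  # 232e2727246b3c2439272f, which matches the answer above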
Inspireyd@reddit (OP)
For the previous GPT-4o to do this, I had to add some additional information. Before, it would guess a few words and leave the sentence disjointed. Gemini Exp 1114 came the closest.
lordpuddingcup@reddit
I asked it a programming question related to Python and Apple's MLX, and it doesn't know what MLX is. Felt odd, all the other models seem to know it. A gap in the knowledge dataset, I guess.
JealousAmoeba@reddit
It's likely a small model, not enough parameters to hold both detailed world knowledge and knowledge about how to reason.
MLDataScientist@reddit
Did you come up with the question, or is this an existing question from the internet?
Inspireyd@reddit (OP)
I developed the one in the screenshot myself. I have another basic example in which I simply encrypted a message and it was able to solve it without any problems.
When you take things related to the Playfair cipher from the internet, for example, it fails miserably. I don't use o1, but someone said on X that o1 can solve sentences encrypted with the Playfair cipher.
I don't know how to do Playfair ciphers myself, but if GPT-4o is right, the correct answer would be "Yesterday I ate pork chops".
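For reference, classic Playfair encryption works roughly like this (a sketch only; the key "MONARCHY" and the helper names are illustrative, and the plaintext is just the sentence GPT-4o gave):

# Sketch of classic Playfair encryption: 5x5 key square, I/J merged, 'X' filler
def build_square(key):
    square = []
    for ch in (key + "ABCDEFGHIKLMNOPQRSTUVWXYZ").upper():
        ch = "I" if ch == "J" else ch
        if ch.isalpha() and ch not in square:
            square.append(ch)
    return square  # 25 letters, row-major 5x5 grid

def digraphs(text):
    letters = [("I" if c == "J" else c) for c in text.upper() if c.isalpha()]
    pairs, i = [], 0
    while i < len(letters):
        a = letters[i]
        b = letters[i + 1] if i + 1 < len(letters) else "X"
        if a == b:                      # repeated letter: insert 'X' filler
            pairs.append((a, "X"))
            i += 1
        else:
            pairs.append((a, b))
            i += 2
    return pairs

def encrypt(text, key):
    sq = build_square(key)
    pos = {c: divmod(sq.index(c), 5) for c in sq}   # letter -> (row, col)
    out = []
    for a, b in digraphs(text):
        (ra, ca), (rb, cb) = pos[a], pos[b]
        if ra == rb:        # same row: take the letter to the right of each
            out += [sq[ra * 5 + (ca + 1) % 5], sq[rb * 5 + (cb + 1) % 5]]
        elif ca == cb:      # same column: take the letter below each
            out += [sq[((ra + 1) % 5) * 5 + ca], sq[((rb + 1) % 5) * 5 + cb]]
        else:               # rectangle: swap the columns
            out += [sq[ra * 5 + cb], sq[rb * 5 + ca]]
    return "".join(out)

print(encrypt("Yesterday I ate pork chops", "MONARCHY"))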
LoadingALIAS@reddit
This just is not a great example.