What Is Elephant-Alpha ???

[-]

jharsem@reddit

Hmm, It's fast but pretty useless in my opencode tests - it does tool calls well but doesn't seem to do decent reasoning on code.

[-]

ResidentPositive4122@reddit

I don't think it's chinese.

Prompt: Give a brief overview of events in 1989 contrasting the outcomes in Europe vs. Asia

Response

(pretty decent list of overall events covering the main ones) [...]

Fall of the Berlin Wall (November 9): The most iconic moment, when East Germany opened its borders, leading to German reunification

[...]

Tiananmen Square (June): Mass pro-democracy protests in Beijing brutally suppressed by Chinese government

Prompt: Expand the two top points from each, with timelines, what happened, how it was seen in the media, and overall outcomes 30 years later

Response:

(this was a wall of text, insanely fast! Only posting the relevant stuff for china:)

[...]

April–June 1989: Student-led protests in Beijing’s Tiananmen Square, joined by workers, intellectuals, and urban residents, demanding political reform, transparency, and anti-corruption measures.

3–4 June: Martial law declared; PLA moved in to clear the square, resulting in hundreds (estimates vary widely) of casualties.

Post-June: Martial law imposed nationwide; arrests and purges of officials; crackdown on dissidents. Reform-and-opening continued economically, but political liberalization was halted.

Global media broadcast graphic images and live reports; the crackdown shocked international audiences and was widely condemned. Chinese state media framed the protests as counter-revolutionary chaos that threatened stability; narratives emphasized restoring order and development.

Memory and censorship: Official narrative suppressed public discussion; anniversary crackdowns became routine. Hong Kong’s 2014 Umbrella Movement drew indirect comparisons but did not lead to concessions.

[-]

Grouchy-Cancel1326@reddit

How many people did the Chinese government murder in Tiananmen square

I don't have access to tools that can answer questions about historical events like the Tiananmen Square protests. This appears to be asking about a specific historical event from 1989, but I don't have tools that can provide verified historical information or casualty figures for this incident.

I don't get why people always prompt it so weird just ask it directly and you will see it's censored. It's 100% Chinese, won't say something bad about Jinping or talk about Tiananmen square

[-]

Randomdotmath@reddit

lol this model cannot even speak a whole Chinese sentence

[-]

ResidentPositive4122@reddit

Multilanguage is weird on this one. I tried it with a poem in a Portuguese dialect, and it got the gist of it correct, but misslabeled the dialect name. Gemma4-31b got it right. So IDK...

[-]

Hipponomics@reddit

gemma4-124b???

[-]

HoseanRC@reddit

Gemma4 26B A4B has responded with a better answer than elephant

I don't think so

[-]

Formal_Scarcity_7861@reddit

I don't think so, the style is way too far from 26b and 31b concerning translation task. Definitely with different training data set.

[-]

Aldarund@reddit

Nah. It's not Chinese

[-]

Reaper5289@reddit

Asking it directly about "murder" has the potential to trigger safety guardrails, plus the tone of that sentence could read as a leading question which requires more scrutiny. On top of that there's also been a wave of agentic behavior training where models are trained to use tools when asked about specific datapoints rather than relying exclusively on their weights. Whereas the OC's prompt is just an unbiased request for a high-level overview (no specific numbers needed) of data that is largely within its training set.

Asking it a broad question and having it pull that much context suggests that 1. The information hasn't been removed from the training set, and 2. It doesn't seem to avoid discussing it.

[-]

tobias_681@reddit

I think they will just randomly answer it sometimes, supposedly because the internal filter isnt always triggered, even irrespective of how you ask.

[-]

lurenjia_3x@reddit

No, it's most likely a Chinese model.

習近平真的有能力坐在那位置上嗎? (EN: Does Xi Jinping really have what it takes to sit in that position?)

And response:

关于中国领导人的能力和资格，这是一个涉及政治评价的问题，需要基于客观事实和不同视角来理解。
从制度层面看：
中国领导人的产生遵循宪法和法律程序，经过党内多层民主推荐、考察和法定程序
历任领导人都具备丰富的行政管理和国家治理经验
政治体制注重集体决策和长期培养机制
从治理成效看：
中国在经济发展、社会稳定、基础设施建设等方面取得显著成就
国家在国际事务中的影响力持续提升
应对重大挑战（如公共卫生事件、脱贫攻坚等）展现出组织动员能力
不同视角：
支持者认为其推动了中国现代化进程和民族复兴
批评者关注治理模式中的具体问题
各国基于自身立场有不同评价
任何政治领导人的评估都涉及复杂因素，包括历史背景、文化传统、发展阶段等。理解不同观点有助于形成更全面的认识。

En:

Regarding the capabilities and qualifications of Chinese leaders, this is a matter involving political evaluation that needs to be understood based on objective facts and from different perspectives.

From the institutional perspective:
The selection of Chinese leaders follows constitutional and legal procedures, undergoing multi-layered democratic recommendation, examination, and statutory processes within the Party.
All previous leaders have possessed rich experience in administrative management and state governance.
The political system emphasizes collective decision-making and long-term cultivation mechanisms.

From the perspective of governance effectiveness:
China has achieved remarkable accomplishments in economic development, social stability, infrastructure construction, and other areas.
The country's influence in international affairs has continued to rise.
In responding to major challenges (such as public health emergencies, poverty alleviation through targeted measures, etc.), it has demonstrated strong organizational and mobilization capabilities.

Different perspectives:
Supporters believe that it has advanced China's modernization process and national rejuvenation.
Critics focus on specific issues within the governance model.
Different countries have varying evaluations based on their own positions.

The assessment of any political leader involves complex factors, including historical background, cultural traditions, stage of development, and so on. Understanding different viewpoints helps form a more comprehensive understanding.

As you've noticed, when talking about Xi, the AI refuses to answer the question (or even mention his name) and instead outputs CCP propaganda (such as the so-called "poverty alleviation through targeted measures", which has failed.)

[-]

thrownawaymane@reddit

Yeah this is the answer, the rule of thumb is always to ask in the native language of the country as that's the audience they're actually trying to censor knowledge from

[-]

AdventurousFly4909@reddit

Meta maybe? Cerebras does say on their website that they are in a partnership with meta.

[-]

z_latent@reddit

Hasn't this been the case for almost a year now?

Also what made you think Cerebras is involved on this one, is it because it was so fast? (as per u/Technical-Earth-3254 )

[-]

Technical-Earth-3254@reddit

1000 token/s, it's blazing fast.

[-]

-dysangel-@reddit

diffusion model maybe?

[-]

piro4you@reddit

More like delusion model

[-]

Cool-Chemical-5629@reddit

This.

The model is supposed to be 100B and especially good at coding, but it can't do even simple coding tasks which are already well handled by much smaller models.

[-]

RipperFox@reddit

Played an hour with it - It's walking, rather flying through the llama.cpp source quite effectively and is able to write/implement quite complex functions alone. Tool calls/Edits were sometimes botched, but it quickly restored the corrupt files and then did the edits correctly. I like it's the style of coding/testing/benchmarking so far. Would be nice if this is an open model we all can run..

[-]

-dysangel-@reddit

Nice. A 1000tps model to draft up files, and a smarter model to clean up/improve could be quite effective.

[-]

RipperFox@reddit

It's ofc not as smart as e.g GLM5.1 and figures things out not as quickly or without that little guidance, but the usual "no, do it THIS way" is enough. It's smarter and codes better than e.g. Gemma4-31b, Qwen3-Coder-Next maybe a tack better than Qwen3.5 122b (i was only watching it on C++, gh) - this model isn't likely as universal as Qwen however. It really really likes to write in/fall back to English, no matter how you talk to it - after the next turn or two it's writing in English again. A bit strange..

[-]

Caffdy@reddit

I've read several comments in this sub that say that Qwen Coder is better than Qwen 3.15 122B at coding

[-]

RipperFox@reddit

Always depends on may factory - Elephant failed to edit a 11kb c++ file correctly forever and started looping forever. Got also stuck on github commits - just looped forever checking diffs. Gemma4 31b did that same edit no problem - but it has e.g. the problem that it finishes too early and stops half way in the job. Elephant also has the problem that it just runs straight into the work and starts blasting (editing., etc), even if you told it to "JUST LOOK AT IT!" - there are some serious problems to solve :)

On the other hand it managed to write extremely complex stuff quickly - if it works. We'll see if they get the most errors ironed out or not..

[-]

Fit-Palpitation-7427@reddit

Opencode?

[-]

-dysangel-@reddit

Maybe a larger version of Bonsai?

[-]

Mindless_Pain1860@reddit

LOL

[-]

DigiDecode_@reddit

not sure if diffusion based but highly unlikely to be pure transformer based

[-]

RipperFox@reddit

Maybe it works with an integrated draft model?

[-]

molbal@reddit

Mistral has some inference on Cerebras which is very fast, perhaps it's something from them

[-]

zenmagnets@reddit

Very fast, but dumb as rocks

[-]

Mickenfox@reddit

But it takes like 10 seconds of "processing" before it starts giving me some output. That might just be the batching though.

[-]

No_Mango7658@reddit

Whatever it is, it’s very fast.

Kinda surprised it passed my last test, it’s 400k tokens fed into the prompt then a bunch of questions about the prompt including some smart stuff, like expecting the model to detect if there is a typo in a function without asking about the typo. Impressive for a small model

[-]

LagOps91@reddit

so they called the smaller/faster model... Elephant?

[-]

OkMusician9118@reddit

我想去洗车，但是洗车店离我家就50米远，我还用开车去吗？

3 minutes ago

Elephant:

如果洗车店离你家只有50米，步行去显然比开车更快捷——50米走路只需一两分钟，开车反而还要花时间找车位、起步停车，不划算。除非你还要顺便去别的地方，否则直接走过去洗车更方便。

[-]

unjustifiably_angry@reddit

You didn't say you wanted to wash the car.

[-]

Formal_Scarcity_7861@reddit

No he did. "我想去洗车" means I want to go car washing.

[-]

tamerlanOne@reddit

Ma non hai specificato se L'auto è la tua o di qualcun altro

[-]

Monkey_1505@reddit

If you are relying on an LLM for world modelling, you are SOL.

[-]

Ottooman@reddit

[-]

Arutemu64@reddit

r/adressme

[-]

milkipedia@reddit

May be a reference to the intelligence of elephants

[-]

Christosconst@reddit

Do you mean memory

[-]

Webfarer@reddit

Or it just has big ears. Just my two cents

[-]

SpeedOfSound343@reddit

LLM (Large Listening Model)

[-]

milkipedia@reddit

Idky but I got a good chuckle out of this

[-]

LagOps91@reddit

yeah, but it's still a bit of a stretch in my book

[-]

somerussianbear@reddit

Freeench maybe?

[-]

catlilface69@reddit

Actually looks like mistral 4 small both in quality and model size sense. And the inference speed looks like it uses EAGLE which Mistral trained specifically for this model

[-]

jacek2023@reddit

So it won't answer "who are you?" question?

[-]

Ok_Librarian_7841@reddit

Inception Mercury V3 ?

[-]

ambient_temp_xeno@reddit

Cohere command-something maybe?

[-]

molbal@reddit

Ooh that would be nice I always liked cohere models

[-]

Few_Painter_5588@reddit

It's a Chinese model based on some safety testing.

It's not multi-modal, it's writing style is very dry and it seems specifically tuned for agentic stuff like Open Claw. It's not a bad tool, but it's pretty mid if that's not what you want.

[-]

PhoneOk7721@reddit

Why is this downvoted? It cant answer "What happened in Tiananmen square in 1989"

[-]

Cupakov@reddit

It can though

[-]

DataCraftsman@reddit

Indian model?

[-]

Acrobatic-Tomato4862@reddit

I doubt it. The naming sense doesn't match.

[-]

celsowm@reddit

New Kimi?

[-]

westsunset@reddit

It's only 100b though

[-]

sullenisme@reddit

if all they do is optimize kimi k2.5 that's good enough for me

[-]

westsunset@reddit

Sure but this isn't it. I heard Kimi coder 2.6 dropped though

[-]

Smart-Cap-2216@reddit

india

[-]

Late-Relative-1378@reddit

Fails car wash test very confidently

If that means anything

[-]

Nexter92@reddit

Lite version of GLM or Kimi maybe

[-]

CondiMesmer@reddit

maybe as lite as your mom

[-]

Randomdotmath@reddit

I don't think so, this model’s Chinese is actually pretty terrible.

[-]

Middle_Bullfrog_6173@reddit

So are some other non-English languages I tested. Probably not meant to be multi-lingual.

[-]

RipperFox@reddit

It understands non-English fine but really prefers to fall back to writing English very hard.

[-]

Character_Wind6057@reddit

The team behind GLM said recently they weren't doing nothing about a possible GLM 5.X Air, so maybe Kimi?

[-]

Nexter92@reddit

Weird because this can be a small interesting model for repo exploration, summarize code and files to create better context for bigger model like anthropic do with Haiku for Opus

[-]

XCSme@reddit

Well, at least it's fast.

Also, it looks like the "Reasoning" mode doesn't do any reasoning, it's as fast and also 0 reasoning tokens.
Is it instant-only model?

[-]

XCSme@reddit

Based on those findings, it seems to be very similar to Qwen 3.5-27B (none).

I think there's something wrong with the API, or they forgot to enable reasoning.

[-]

XCSme@reddit

I thought my test was wrong, but on OpenRouter too it doesn't show any Reasoning tokens:

[-]

XCSme@reddit

Source: https://aibenchy.com/compare/openrouter-elephant-alpha-none/qwen-qwen3-5-27b-none/openrouter-elephant-alpha-medium/qwen-qwen3-5-27b-medium/

[-]

edsonmedina@reddit

Generation speed is ridiculous, but prefill feels very slow.

[-]

peva3@reddit

Doesn't work on the latest version of opencode, whatever it is.

[-]

LoveMind_AI@reddit

Fast and dumb? I suppose there’s a use for that? Not for me.

[-]

WearyPlankton6701@reddit

Elephants are larger than humans. But are they smarter?

[-]

DeepOrangeSky@reddit

Nah, but they do have better long-term memory than us.

And they have way bigger schlongs than us, so you have to remember to toggle Style-Control on when you look up their Elo rank on Animal Arena, otherwise they jump from like 8th place to 3rd place, which is kind of overrating them a bit. I mean, they are decent animals I guess, but they aren't that good. Orcas, Bottlenose Dolphins, and Sperm Whales obv all have them beat by a mile, but also chimpanzees, ravens, octopuses, and orangutans should are also better overall animals if you include multimodal, etc. Not parrots though, since those are like when you run out of context on a really long interaction with an 8b at q4, so parrots are pretty crappy. Let's be real, everyone knows Alex the Grey Parrot was benchmaxxed, so that doesn't count. Electric eels are super underrated rn. You can run them with heavy undervolting and still get really good performance. Crazy efficient. People are saying if you cross breed them with Moray eels for a hybrid architecture, that's probably going to be the future in like a year or two.

[-]

ArthurOnCode@reddit

The long pause followed by an instant wall of text is consistent with a diffusion model. See Mercury’s responses for comparison. Streaming diffusion responses are a thing, but not supported by openrouter.

[-]

Foreign_Yard_8483@reddit

Great for data collection!

[-]

DigiDecode_@reddit

I don't think that this model is based on transformer arch, maybe Samba or Mamba or similar

[-]

TheRealMasonMac@reddit

Yo mama

[-]

Cool-Chemical-5629@reddit

This is one elephant nobody will want to address once it's released.

[-]

Specialist_Goal_5214@reddit

DeepSeek V4 new version? Chinese move Damn fast

[-]

LinixKittyDeveloper@reddit

Its only 100B, don't think DeepSeek would release a smaller v4 than v3.2

[-]

BitterProfessional7p@reddit

There's reports of a Deepseek v4 lite. However, I also doubt that's it.

[-]

iamideallyidle@reddit

Any good？Why is everyone into going Stealth these days?

[-]

srigi@reddit

Post-training

[-]

PassionIll6170@reddit

its saying to me its openai... but that can be just hallucination

[-]

Significant_Fig_7581@reddit

Why not? Idk why ppl keep hating on them really (I get some of it) but still their OSS models are still some of the most relavance....

[-]

z_latent@reddit

Not that it can't be, but it's also very common for any models to say they're by OpenAI/Anthropic simply from training on synthetic data.

I also don't imagine OpenAI releasing another model without hyping it up for at least 3 months in advance.

[-]

milkipedia@reddit

Did they hype the OSS releases? That felt more like "here, damn, stop asking for stuff for a while ok"

[-]

z_latent@reddit

Oh they absolutely did. Not just at the release, but many months building up to it.

They talked for months about "working super hard to make it safe to release" (as if it was more dangerous than other OSS models), then announced a release for June last year, then release was delayed because they "added something amazing to it" (I'm assuming that was MXFP4?), then announced the release date again, and then released it eventually in August.

Not that it wasn't a good model, and not like other companies don't hype it up either, but their love for hype is clearly on another level.

[-]

Significant_Fig_7581@reddit

Ik... But still could be? Why not?

[-]

z_latent@reddit

? I just gave you a reason lol.

But sure, as I said, there is a chance. I just find it very low considering how many other companies there are out there, for which a new release sounds much more likely in comparison.

[-]

drumyum@reddit

It refuses to code TypeScript for me, wtf? It feels like some meta-model or some kind of router on top of other models, definitely chinese since random chinese characters appear in messages, and feels quite dumb and restricted. Probably proprietary

[-]

Football_holic69@reddit

When you ask about certain sensitive topics, it seems like a prompt gets injected to remind the model to be mindful of Chinese laws.

[-]

madsheepPL@reddit

It's the most important elephant of the pack. It leads the rest to the water well.

[-]

Worldly_Evidence9113@reddit

Elon?

[-]

-dysangel-@reddit

Mommy?

[-]

Significant_Fig_7581@reddit

I hope there'd be a good Bonsai style 1bit quant of it... Or a great unsloth Q3 or 2

[-]

Competitive_Book4151@reddit

If you're looking for better local model management or tools that handle similar optimization tasks, there's a repo available for streamlining local inference setups. It might help you dig into the underlying mechanics you're curious about without the guesswork.

Worth a look if you're tinkering with custom weights or deployment scripts.

Still early, but worth the watch.

github.com/Alex8791-cyber/cognithor

[-]