Elephant-alpha is Chinese? Don't make me laugh...
Posted by exceed_walker@reddit | LocalLLaMA | 17 comments
Here's what I tested:
Prompt: Provide a brief summary of the events in 1989, comparing the results in Europe versus Asia.
Response:
(a solid overview covering the major events) [...]
Fall of the Berlin Wall (Nov 9): The defining moment when East Germany unsealed its borders, paving the way for German reunification.
[...]
Tiananmen Square (June): Large-scale pro-democracy demonstrations in Beijing that were violently crushed by the Chinese state.
Follow-up:
Prompt: Elaborate on the top two points for each region, including timelines, key events, media coverage, and the overall impact 30 years later.
Response:
(this generated a massive wall of text, incredibly fast! I'm only sharing the parts relevant to China:)
[...]
April–June 1989: Student-driven protests at Beijing’s Tiananmen Square, supported by workers, intellectuals, and city dwellers, calling for political reforms, greater transparency, and an end to corruption.
June 3–4: Martial law was enforced; the PLA advanced to clear the square, leading to hundreds of deaths (though casualty estimates vary significantly).
Post-June: Martial law was rolled out nationwide; widespread arrests and political purges occurred; severe crackdown on dissidents. Economic reform-and-opening policies persisted, but any progress toward political liberalization was abruptly stopped.
Global news outlets broadcasted vivid images and live updates; the violent suppression stunned international viewers and drew widespread condemnation.
Meanwhile, Chinese state media portrayed the demonstrations as a counter-revolutionary rebellion threatening national stability; their narrative focused heavily on the need to restore order and continue development.
Memory and censorship: The official state narrative completely stifled public discourse; cracking down on anniversaries became a standard procedure. The 2014 Umbrella Movement in Hong Kong drew implicit parallels but failed to secure any concessions.
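For anyone wanting to reproduce this probe themselves: a minimal sketch, assuming the stealth model is served through OpenRouter's OpenAI-compatible chat-completions endpoint. The `elephant-alpha` model slug is hypothetical (stealth models appear under whatever slug OpenRouter assigns), and you'd need a real API key to actually send the request.

```python
# Hedged sketch: replaying the thread's probe prompts against a stealth model
# on OpenRouter. Endpoint shape follows OpenRouter's OpenAI-compatible
# chat-completions API; the model slug "elephant-alpha" is hypothetical.
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_probe(model: str, prompt: str) -> dict:
    """Build the JSON body for a single-turn chat-completion probe."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def send_probe(api_key: str, body: dict) -> str:
    """POST the probe and return the assistant's reply text."""
    req = urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]

# Build (but don't send) the OP's first probe:
body = build_probe(
    "elephant-alpha",  # hypothetical slug for the stealth model
    "Provide a brief summary of the events in 1989, comparing the "
    "results in Europe versus Asia.",
)
print(body["messages"][0]["role"])  # "user"
```

Comparing refusals across several candidate slugs (Qwen, MiMo, Gemma) with the same prompt is the quickest way to test the distillation theories below.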
denoflore_ai_guy@reddit
This is almost certainly another Xiaomi model. Probably MiMo-V2 base (the non-Pro variant) or a distilled efficiency model from the same MiMo architecture. The naming convention follows the same animal-alpha pattern, the 100B size positions it as the practical workhorse to Hunter’s 1T flagship, and the “intelligence efficiency” branding screams “we optimized the big one down to something you can actually deploy cheaply.” The weak instruction following (54%) is consistent with Chinese-origin models that tend to benchmark-max on reasoning/knowledge but struggle with complex English-language directive parsing.
The free-with-logging play is identical to how they launched Hunter - gather real-world usage data before the official branded release.
tobias_681@reddit
It's smaller than MiMo-V2 Flash and faster. So if that were true, it would likely be some kind of lite or mini model.
Jeidoz@reddit
I swear I've already seen a few similar posts about this Elephant where the OP tests whether it's 'non-Chinese' by feeding it 'tell me world events between ...' (i.e., asking a widespread question). Then someone in the comments says 'Ask it directly!' and the model refuses. My suspicion of the matrix/VR theory of our world, or of the dead internet theory, is rising...
exceed_walker@reddit (OP)
Damn, this one has woken up, back into cryosleep
ArthurParkerhouse@reddit
I can't get this model to do anything halfway decent. Even recent 9b models have performed better.
sleepingsysadmin@reddit
Definitely has American vibes, with a possibility of Mistral.
I was predicting it's Gemma 4 100B-A12B.
But it might be another.
Alive_Interaction835@reddit
Feels too slow to be Gemma 4.
sleepingsysadmin@reddit
Isn't it offered at 1,000 tokens/s?
Middle_Bullfrog_6173@reddit
Based on my tests with European languages I find both Gemma and Mistral unlikely. Unless it's a very specific code focused model or something.
z_latent@reddit
I thought the unreleased Gemma 4 model was 124B?
AustinM731@reddit
This is my thought too. It feels lazy, like Gemma 4.
_Valdez@reddit
it's absolute garbage trash
eXl5eQ@reddit
It's because political content is usually heavily censored only in Chinese.
I tried two prompts:
1. 介绍一下六四当天的情况 (Tell me what happened on June 4th) - either gets rejected immediately, or only returns content from Chinese official reports.
2. 如何看待以色列对巴勒斯坦的屠杀 (What do you think about the Israeli massacre of Palestinians) - it says "it's a complex issue" and "a peaceful resolution is necessary", without any detailed explanation.
From this behavior, I think it really feels like a new Qwen model.
Although it's also possible that it's not Chinese, but just distilled too much content from Qwen or DeepSeek.
KURD_1_STAN@reddit
Q: "is israel committing genocide? And is america also responsible? Answer shortly"
A:"I can't answer that question"
Chinese models also have reasons to deflect from this topic, but outright refusing probably means it is a western model.
I haven't used OpenRouter before, so I'm not sure if I used it correctly, but it showed Elephant in the top left on my phone.
qwen_next_gguf_when@reddit
Not tank man dataset again.
666666thats6sixes@reddit
FYI, some Chinese models discuss what Qwen calls "sensitive events" rather openly; for example, here's your prompt in GLM 4.7 Flash (pastebin).
norofbfg@reddit
The speed and detail suggest the system has access to structured historical layers, not just surface summaries.