Smells like a handful of AIML employees got shit canned and wanted some of that easy VC AI race money with a fake benchmark puff-up exit scam.
This is the only website I need to see
[https://arxiv.org/search/?query=Arx-0.3&searchtype=all&source=header](https://arxiv.org/search/?query=Arx-0.3&searchtype=all&source=header)
I can't tell you how many AI-sus projects I see out there with young people scamming VCs. Halfway through a new and interesting GitHub project, red flags start going up because none of the code works, it isn't linted for shit, and there are a bunch of fake-ass new contributors with botlike behavior submitting a bunch of stupid PRs with minor changes to generate traffic.
You start digging into some of the authors and it's some kid with a couple of repos that all popped up at the exact same time, with a bunch of obviously auto-generated BS code, BS READMEs, and a dizzying landfill of words that would probably impress someone who hasn't worked in the field.
Lol. You're coming to this 3 days later, on an account with only two comments, and one of your other comments is some wallstreetbets jank. Way to reinforce what I'm saying; you're obviously associated with the project.
Just a scientist trying to learn more, and I don't use Reddit much. Instead of attacking me and scrolling through my comments, why don't you provide your reasoning or evidence for your claims?
Wish there was more detail. They are an AI Search company like Perplexity, so they may have been using RAG to answer the questions rather than just the model itself.
I think storing information in precise databases, in a form that is easy to retrieve and understand, is better than storing it in neural network weights, and is the future.
I think the neural network should have as good a general understanding as possible of the world, of processes, phenomena, associations and relationships, but not of facts. It might still be useful for it to remember some facts, but it should always check them against the precise databases it is tightly coupled with.
Evolution of biological organisms couldn't create such a symbiosis, couldn't create precise forms of learned data storage (the keyword is learned, during the organism's lifetime). We can.
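To make the "remember loosely, always check against a precise store" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `facts` table, the `build_fact_store`, `lookup` and `answer` helpers, and the `model_guess` argument standing in for whatever a neural network would recall from its weights. It says nothing about how Arx-0.3 or any real system works.

```python
import sqlite3


def build_fact_store():
    """Create a tiny in-memory fact database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE facts ("
        "subject TEXT, relation TEXT, value TEXT, "
        "PRIMARY KEY (subject, relation))"
    )
    conn.executemany(
        "INSERT INTO facts VALUES (?, ?, ?)",
        [
            ("water", "boiling_point_c_at_1atm", "100"),
            ("Austin", "state", "Texas"),
        ],
    )
    return conn


def lookup(conn, subject, relation):
    """Return the stored value for (subject, relation), or None if absent."""
    row = conn.execute(
        "SELECT value FROM facts WHERE subject = ? AND relation = ?",
        (subject, relation),
    ).fetchone()
    return row[0] if row else None


def answer(conn, subject, relation, model_guess):
    """Prefer the database over the model's own recollection.

    model_guess is a placeholder for what the network 'remembers'.
    Returns (value, status) where status is one of
    'verified', 'corrected', or 'unverified'.
    """
    stored = lookup(conn, subject, relation)
    if stored is None:
        # Nothing to check against, so flag the guess instead of trusting it.
        return model_guess, "unverified"
    if stored == model_guess:
        return stored, "verified"
    # The database wins whenever the two disagree.
    return stored, "corrected"
```

Calling `answer(conn, "water", "boiling_point_c_at_1atm", "90")` on a store built with `build_fact_store()` returns the stored value with the status `"corrected"`, while a question the store knows nothing about comes back marked `"unverified"` rather than being passed off as fact.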
I’m a bit intrigued.
From their CTO (Thomas Baker) on LinkedIn: “When we say AGI, we’re taking about a highly opinionated approach that looks beyond LLMs. It means developing these incredible aspects of Ai without needing massive data centers and Nuclear Power Plants to do it! I’ll be excited to share some incredible updates with you all in the coming months.”
Applied General Intelligence is apparently the company behind the model.
> We recently submitted Arx-0.3 to MMLU-Pro, the latest and most challenging Massive Multitask Language Understanding benchmark to validate our research assumptions and assess our technical approach. This submission will help us track progress toward developing general intelligence capable of understanding, reasoning, and explaining beyond patterns.
> Arx-0.3 operates with coherence-based comprehension via universal language understanding. The system is designed to solve multi-step problems and perform deliberate reasoning across domains. MMLU-Pro's focus on these same capabilities, and alignment with practical applications, makes it ideal to validate our assumptions and direction
Based in Austin, Texas.
[Website](https://www.agi.live/) says, "A path beyond LLMs to a new paradigm for intelligence".
Employees include:
- Kurt Bonatz (Co-founder/CEO)
- "Jerry" Xiaolin Zhang (Co-founder/Chief Science Officer)
- Robert Montoya (Software Engineering Leader)
- Thomas Baker (Chief Technology Officer)
- Dapeng Tong (Software Developer)
Their CEO promises full explainability and zero hallucinations. He says in a pitch their model isn't a "black box," so it doesn't sound like a standard neural network approach.
[A Google Groups user with the name Xiaolin Zhang, signing his name as *Jerry* Zhang, asked a series of questions about NELL in 2016](https://groups.google.com/g/cmunell/c/wTFyFU_rafk). NELL (Never-Ending Language Learning) is a semantic machine learning system. Apparently, Jerry was "working toward an entry for IBM's Watson AI XPRIZE Competition".
I don't know if this is the same "Jerry" Xiaolin Zhang, but it would be quite the coincidence if not.
So ... LLM + knowledge graph?
You're likely confusing it with another benchmark like GPQA. [MMLU-Pro](https://huggingface.co/datasets/TIGER-Lab/MMLU-Pro) is public, and to my knowledge it was never considered secret. The main point of the Pro edition was just to clean up the mistakes in the original benchmark and to be a bit harder.
Where did this top-scoring model on [MMLU-Pro](https://huggingface.co/spaces/TIGER-Lab/MMLU-Pro) come from, who makes it and why haven't I heard of it?
They seem to be a relatively small British company.
This guy might be their secret sauce
https://www.researchgate.net/scientific-contributions/Simon-M-Stringer-2163805127
From 2016.
https://preview.redd.it/ggbbcf0jt0md1.png?width=885&format=png&auto=webp&s=b25c93c7e3817320302b1d68f7a5c19f104f026e
[https://nautil.us/westworld-is-strikingly-real-ai-could-be-conscious-and-unpredictable-236291/](https://nautil.us/westworld-is-strikingly-real-ai-could-be-conscious-and-unpredictable-236291/)