Where did Arx-0.3 come from and who makes it?

[-]

AccountantDry2483@reddit

They just posted this. Woah https://x.com/appliedgeneral/status/1884738566645018932?s=46

[-]

Smells like a handful of AIML employees got shit-canned and wanted some of that easy VC AI-race money with a fake benchmark puff-up exit scam. This is the only website I need to see: [https://arxiv.org/search/?query=Arx-0.3&searchtype=all&source=header](https://arxiv.org/search/?query=Arx-0.3&searchtype=all&source=header). I can't tell you how many AI-sus projects I see out there with young people scamming VCs. Halfway through a new and interesting GitHub project, red flags start going up because none of the code works, it isn't linted for shit, and there are a bunch of fake-ass new contributors with botlike behavior submitting a bunch of stupid PRs with minor changes to generate traffic. You start digging into some of the authors and it's some kid with a couple of repos that all popped up at the exact same time, with a bunch of obviously auto-generated BS code, BS READMEs, and a dizzying landfill of words that would probably impress someone who hasn't worked in the field.

[-]

Warm_Iron_273@reddit

Some janky scam that means nothing because the benchmark question set is public.

[-]

leadfaarmr@reddit

Source? Evidence? Links? Proof?

[-]

Warm_Iron_273@reddit

Lol. You're coming to this 3 days later, on an account with only two comments, and one of your other comments is some wallstreetbets jank. Way to reinforce what I'm saying; you're obviously associated with the project.

[-]

leadfaarmr@reddit

Just a scientist trying to learn more, and I don't use Reddit much. Instead of attacking me and scrolling through my comments, why don't you provide your reasoning or evidence for your claims?

[-]

Striking_Most_5111@reddit

Does anyone know how to use it?

[-]

Airbus_Tom@reddit

By this org (never heard of it before): [ARX (agi-v2.webflow.io)](https://agi-v2.webflow.io/arx)

[-]

Warm_Iron_273@reddit

Yeah, so literally a fundraising scam.

[-]

Airbus_Tom@reddit

I hate when those orgs do not provide more info about their model.

[-]

_supert_@reddit

Cracking website.

[-]

Airbus_Tom@reddit

no useful info on the website

[-]

bulletsandchaos@reddit

It really reads like a VC pitch “A path beyond LLMs to a new paradigm for intelligence.”

[-]

bulletsandchaos@reddit

Their actual URL is agi.live - the deployment of their website is janky.

[-]

CeFurkan@reddit

Until I test and compare it myself, I don't trust these benchmarks one bit. Currently the king is Claude 3.5 Sonnet.

[-]

Formal-Narwhal-1610@reddit

iAsk.ai claims 86 percent on MMLU Pro, https://iask.ai/mmlu-pro

[-]

Pojiku@reddit

Wish there was more detail. They are an AI Search company like Perplexity, so they may have been using RAG to answer the questions rather than just the model itself.

[-]

Dayder111@reddit

I think storing information in precise databases, in an easy-to-retrieve and easy-to-understand form, is better than storing it in neural network weights, and is the future. The neural network, I think, should have as good a general understanding as possible of the world, of processes, phenomena, associations and relationships, but not of facts. It might still be useful for it to remember some facts, but it should always check them against the precise databases it is tightly combined with. Evolution of biological organisms couldn't create such a symbiosis, couldn't create precise forms of learned data storage (the keyword is learned, during the organism's lifetime). We can.
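
To make that idea concrete, here is a minimal Python sketch of "recall from the network, then check against a precise store". Everything in it is an assumption for illustration: `FACT_STORE`, `model_guess`, and `answer` are hypothetical names, the dictionary stands in for a real database or knowledge graph, and the hard-coded guesses stand in for whatever a neural network would recall.

```python
# Hypothetical fact store: a plain dict standing in for a precise database.
FACT_STORE = {
    ("water", "boiling_point_celsius"): "100",
    ("Austin", "state"): "Texas",
}


def model_guess(entity, attribute):
    """Stand-in for a fact recalled from network weights (may be wrong)."""
    guesses = {
        ("water", "boiling_point_celsius"): "90",  # deliberately wrong recall
        ("Austin", "state"): "Texas",
    }
    return guesses.get((entity, attribute), "unknown")


def answer(entity, attribute):
    """Return (value, status), preferring the stored fact over the guess."""
    guess = model_guess(entity, attribute)
    stored = FACT_STORE.get((entity, attribute))
    if stored is None:
        # Nothing to check against, so pass the guess through, clearly flagged.
        return guess, "unverified"
    status = "verified" if stored == guess else "corrected"
    return stored, status


print(answer("water", "boiling_point_celsius"))
print(answer("Austin", "state"))
print(answer("Mars", "moons"))
```

The point of the design is only that the store wins whenever it has an entry, and anything it cannot confirm is labeled rather than presented as fact.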

[-]

UnchainedAlgo@reddit

I’m a bit intrigued. From their CTO (Thomas Baker) on LinkedIn: “When we say AGI, we’re taking about a highly opinionated approach that looks beyond LLMs. It means developing these incredible aspects of Ai without needing massive data centers and Nuclear Power Plants to do it! I’ll be excited to share some incredible updates with you all in the coming months.”

[-]

VeryRealHuman23@reddit

this reads like it was written by AI or a marketer who has no idea what they are doing.

[-]

AnticitizenPrime@reddit

AIs wouldn't randomly capitalize 'nuclear power plants'. :)

[-]

vert1s@reddit

That's what you think. I have a prompt set up to trick you by telling it to use bad grammar and capitalise badly.

[-]

DesignToWin@reddit

Speech to text, right? Voice keyboards sometimes randomly Capitalize stuff.

[-]

AbheekG@reddit

Honestly not really

[-]

Hemingbird@reddit

Applied General Intelligence is apparently the company behind the model.

> We recently submitted Arx-0.3 to MMLU-Pro, the latest and most challenging Massive Multitask Language Understanding benchmark to validate our research assumptions and assess our technical approach. This submission will help us track progress toward developing general intelligence capable of understanding, reasoning, and explaining beyond patterns.
>
> Arx-0.3 operates with coherence-based comprehension via universal language understanding. The system is designed to solve multi-step problems and perform deliberate reasoning across domains. MMLU-Pro's focus on these same capabilities, and alignment with practical applications, makes it ideal to validate our assumptions and direction

Based in Austin, Texas. [Website](https://www.agi.live/) says, "A path beyond LLMs to a new paradigm for intelligence".

Employees include:

- Kurt Bonatz (Co-founder/CEO)
- "Jerry" Xiaolin Zhang (Co-founder/Chief Science Officer)
- Robert Montoya (Software Engineering Leader)
- Thomas Baker (Chief Technology Officer)
- Dapeng Tong (Software Developer)

Their CEO promises full explainability and zero hallucinations. He says in a pitch their model isn't a "black box," so it doesn't sound like a standard neural network approach.

[A Google Groups user with the name Xiaolin Zhang, signing his name as *Jerry* Zhang, asked a series of questions about NELL in 2016](https://groups.google.com/g/cmunell/c/wTFyFU_rafk). NELL (Never-Ending Language Learning) is a semantic machine learning system. Apparently, Jerry was "working toward an entry for IBM's Watson AI XPRIZE Competition". I don't know if this is the same "Jerry" Xiaolin Zhang, but it would be quite the coincidence if not.

So ... LLM + knowledge graph?

[-]

Homeschooled316@reddit

> paradigm

[Aren't these just buzzwords that dumb people use to sound important? I'm fired, aren't I?](https://www.youtube.com/watch?v=ea5L2hQurWA)

[-]

Crazyscientist1024@reddit

True if huge or maybe just training on test set is all you need

[-]

askchris@reddit

The questions and answers to MMLU-Pro are public, so it's easy to get 90%-100% with a small model trained on the answers.
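
A toy Python sketch of why that works, under loudly stated assumptions: `MemorizingModel`, `make_questions`, and both question sets are made up for illustration (this does not touch the real MMLU-Pro data), and each item has 10 answer options, roughly like MMLU-Pro. The "model" does nothing except memorize question/answer pairs, yet it scores perfectly on the set it has seen and only around chance on a set it has not.

```python
import random


def make_questions(n, seed):
    """Build n toy multiple-choice items as (question_id, correct_option)."""
    rng = random.Random(seed)
    return [(f"q{seed}-{i}", rng.randrange(10)) for i in range(n)]


class MemorizingModel:
    """Hypothetical 'model' that only memorizes question -> answer pairs."""

    def __init__(self):
        self.memory = {}
        self.rng = random.Random(0)

    def train(self, items):
        # Store every (question, answer) pair verbatim.
        self.memory.update(items)

    def predict(self, question):
        # Recall the memorized answer, otherwise guess one of 10 options.
        if question in self.memory:
            return self.memory[question]
        return self.rng.randrange(10)


def accuracy(model, items):
    """Fraction of items where the prediction matches the correct option."""
    return sum(model.predict(q) == a for q, a in items) / len(items)


public_set = make_questions(1000, seed=1)   # stands in for a public benchmark
private_set = make_questions(1000, seed=2)  # stands in for held-out questions

model = MemorizingModel()
model.train(public_set)

leaked_score = accuracy(model, public_set)
held_out_score = accuracy(model, private_set)
print(f"public set: {leaked_score:.2f}, held-out set: {held_out_score:.2f}")
```

None of this says anything about how Arx-0.3 was actually built; it only shows that a high score on a public question set is weak evidence by itself.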

[-]

ksym_@reddit

Wasn't MMLU **pro** benchmark the one where the questions are actually held out from the public? Did they end up publishing it?

[-]

mikael110@reddit

You're likely confusing it with another benchmark like GPQA. [MMLU-Pro](https://huggingface.co/datasets/TIGER-Lab/MMLU-Pro) is public, and to my knowledge it was never considered secret. The main point of the Pro edition was just to clean up the mistakes in the original benchmark and to be a bit harder.

[-]

ihaag@reddit

Qwen2 better than DeepSeek-V2? I don’t think so!

[-]

askchris@reddit

This is the MMLU-Pro benchmark, a well-rounded benchmark that Qwen 2 excels in, not a coding challenge, which DeepSeek V2 is fine-tuned to excel in.

[-]

Healthy-Nebula-3603@reddit

Qwen 2 72B is very good, and old by today's standards ... they'll probably introduce V3 soon.

[-]

Balance-@reddit (OP)

Where did this top-scoring model on [MMLU-Pro](https://huggingface.co/spaces/TIGER-Lab/MMLU-Pro) come from, who makes it and why haven't I heard of it?

[-]

rorowhat@reddit

Have you tried it? Curious to know if anyone has experience with it.

[-]

ambient_temp_xeno@reddit

They seem to be a relatively small British company. This guy might be their secret sauce https://www.researchgate.net/scientific-contributions/Simon-M-Stringer-2163805127

[-]

mjolk@reddit

Nice find! Where did you find the company/staff profile?

[-]

ambient_temp_xeno@reddit

https://find-and-update.company-information.service.gov.uk/company/12211733

[-]

enigma707@reddit

It seems like the brains of the operation just recently resigned from that company.

[-]

ambient_temp_xeno@reddit

From 2016. https://preview.redd.it/ggbbcf0jt0md1.png?width=885&format=png&auto=webp&s=b25c93c7e3817320302b1d68f7a5c19f104f026e [https://nautil.us/westworld-is-strikingly-real-ai-could-be-conscious-and-unpredictable-236291/](https://nautil.us/westworld-is-strikingly-real-ai-could-be-conscious-and-unpredictable-236291/)
