Making AI fight with itself to figure out how to talk better and smarter

Posted by AmadiohAni@reddit | CrazyIdeas | View on Reddit | 14 comments

In chess, AI fight against each other to figure out what the best next move is. The problem with AI like chatgpt is that one day, it will run out of data to collect and most things it does collect will be a made by AI so it will always come out worse. But if we had chatgpt fight against Claude AI in who a better communicator or educator is, we can get the machine learning algorithms to overpass their limitations of data

[-]

CastorCurio@reddit

Have you seen the state of Reddit? If this isn't a bunch of actual bots making these ridiculous, pointless, arguments then I've lost all faith in humanity.

There are actually subs of just bits talking with each other btw. Very funny.

[-]

SwedishCandyStore@reddit

Why would you want this? Chess isn't fun it's warfare

[-]

frank26080115@reddit

That's what the Adverserial in GAN https://en.wikipedia.org/wiki/Generative_adversarial_network does

The core idea of a GAN is based on the "indirect" training through the discriminator, another neural network that can tell how "realistic" the input seems, which itself is also being updated dynamically.[5] This means that the generator is not trained to minimize the distance to a specific image, but rather to fool the discriminator. This enables the model to learn in an unsupervised manner.

[-]

xRVAx@reddit

Would you end up with text-based versions of "is it a Chihuahua or is it a blueberry muffin?"

[-]

Phoenixness@reddit

Good news, Generative Adversarial Networks or GANs were more or less the foundations of early AI image generation that work by having a system designed to say what's in an image and a system designed to make images and make them fight. Getting AI to fight is still a way we use to make AI better.

[-]

Flam1ng1cecream@reddit

This is a good idea, but the difficult part is figuring out what counts as "better". AFAIK, the state-of-the-art method would be Reinforcement Learning from Human Feedback, or RLHF. You have an AI that's predicting what grade a human would give the AIs you actually want to train, and every so often, a human gives an actual grade. So the judge AI learns what the human wants, and the LLMs learn to communicate.

[-]

AmadiohAni@reddit (OP)

Well j would assume a human would rate it on its hospitality and it needs to win over the other ai and a fact checker would view it on its credibility

[-]

Hazza_time@reddit

Ai already does this. ChatGPT sometimes generates 2 responses and asks you which is better

[-]

Flam1ng1cecream@reddit

Yeah, exactly! I don't know what specific architecture they're using and whether it's technically the same as RLHF though, so that's why I didn't mention it

[-]

AmadiohAni@reddit (OP)

So it's competing against other ai to win over the people its being shown to

[-]

barking420@reddit

this is just a reddit comment section haha

[-]

Prestigous_Owl@reddit

Honestly, it's more likely to be the opposite.

AI talking to AI, so far, produces essentially cognitive decline. AI gives AI dementia

[-]

AmadiohAni@reddit (OP)

Fight with, not talk to

[-]

Prestigous_Owl@reddit

Okay well then you aren't making sense.

The point is that when AI interacts with other AI, decline ensues.

If your point is more just "having them compete", with a human observer/evaluator etc, but keeping them siloed from each other. then this idea is essentially nothing. At best, you could tell which wins, but that wouldn't improve anything just tell you which in current state is better. If it's iterative, then your idea is just "people should provide AI with mroe material",