Decent model to "quickly" recognize rule violations?

Posted by xephadoodle@reddit | LocalLLaMA | 7 comments

Hello all, I am building an AI agent orchestrator of sorts, and I want to add a local model that can quickly recognize when the AI agents are breaking basic rules, such as stashing files to avoid fixing tests, or mentioning anything about "simplifying" the code or tests (always a bad sign that the agent is taking the lazy route).
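
To make the two example rules concrete, here is a rough sketch of a cheap pattern-based pre-check that could run before (or alongside) any model. It is not the local model itself, only a regex stand-in. The rule names, the patterns, and the `flag_violations` helper are all hypothetical and would need tuning against real agent output.

```python
import re
from dataclasses import dataclass

# Hypothetical rules for illustration only; these patterns are assumptions,
# not a vetted list of everything an agent might do to dodge work.
RULES = {
    "stash_files": re.compile(r"\bgit\s+stash\b", re.IGNORECASE),
    "simplify": re.compile(r"\bsimplif(?:y|ies|ied|ying|ication)\b", re.IGNORECASE),
}


@dataclass
class Violation:
    rule: str     # key from RULES that matched
    excerpt: str  # short window of text around the first match


def flag_violations(agent_output: str) -> list[Violation]:
    """Return one Violation per rule whose pattern appears in the agent output."""
    violations = []
    for name, pattern in RULES.items():
        match = pattern.search(agent_output)
        if match:
            start = max(match.start() - 20, 0)
            violations.append(Violation(name, agent_output[start:match.end() + 20]))
    return violations


if __name__ == "__main__":
    for v in flag_violations("I'll run git stash and then simplify the tests."):
        print(v.rule, "->", v.excerpt)
```

A check like this only catches literal phrasing, so anything it misses (or anything ambiguous) would still have to go to the model with the rule context.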

I have a 24 GB NVIDIA GPU on hand, but I am unsure which models could be given some basic rule context and then flag violations quickly and reliably.

Thanks in advance, and sorry if this might be a dumb/impossible question.