Chain-of-Thought reasoning on the next token prediction level?

Posted by Marha01@reddit | LocalLLaMA | View on Reddit | 2 comments

A single LLM pass ultimately outputs a single next token to append to the prompt. It seems to me that the most logical CoT approach would be a reasoning system that works at that level: output a proposed next token, then run the LLM again, asking it to evaluate and justify the proposed choice. It then either confirms the token and definitively appends it to the prompt, or produces an alternative "best next token" proposal (or a list of proposals?), repeating the decision process until the best next token is definitively confirmed.
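For concreteness, here is a minimal sketch of the loop I have in mind. The `propose` and `evaluate` callables stand in for the two LLM passes (they are hypothetical stubs here, not a real API; `max_revisions` is an assumed cap to keep the loop from spinning forever):

```python
def generate_with_token_cot(prompt, propose, evaluate, max_revisions=3):
    """Propose a next token, ask an evaluator pass to confirm or revise it,
    and only append a token once confirmed (or revisions run out)."""
    candidate = propose(prompt)
    for _ in range(max_revisions):
        verdict, alternative = evaluate(prompt, candidate)
        if verdict == "confirm":
            break
        candidate = alternative  # take the evaluator's counter-proposal
    return prompt + candidate


# Toy deterministic stand-ins for the two LLM passes, just to show the flow:
propose = lambda p: "x"

def evaluate(prompt, token):
    # Reject the first proposal "x" in favor of "y", then confirm.
    return ("revise", "y") if token == "x" else ("confirm", None)

print(generate_with_token_cot("abc", propose, evaluate))  # -> "abcy"
```

Obviously a real version would pay two-plus full forward passes per emitted token, which is the main cost question.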

Has anyone tried such a "token level" CoT reasoning approach?