Is there any <3B model with a usable 200k+ context window?

Posted by madmax_br5@reddit | LocalLLaMA

I need a small model for processing conversation transcripts from larger models, so I need a context window that stays usable out to at least 200k tokens. I know some models claim to support this, but I don’t know which ones actually handle it well in practice.
Also desirable: low hallucination rate, not super verbose.