New model for detecting and masking PII from OpenAI
Posted by doesitoffendyou@reddit | LocalLLaMA | View on Reddit | 8 comments
xAragon_@reddit
Old news, there were already several posts on this.
- https://www.reddit.com/r/LocalLLaMA/comments/1ssp4kb/openai_privacy_filter_model/
- https://www.reddit.com/r/LocalLLaMA/comments/1stjl04/openai_privacy_filter_goes_openweight_apache_20/
- https://www.reddit.com/r/LocalLLaMA/comments/1ssps99/new_openai_privacy_filter_model_running_locally/
Dry_Researcher_1676@reddit
ez advertisement
reddysteady@reddit
Stack overflow lives on
Daemontatox@reddit
Have people never heard of PII models? Like hello? Why would I ever use this over any of the other ultra-light and ultra-fast models?
Also, this seems to be English-only and behaves really badly on other languages.
SkyFeistyLlama8@reddit
OpenAI knows all about how to mask PII because they've been hoovering up people's PII for years.
LegacyRemaster@reddit
They released it a few days ago. They say, "If you want to use your stuff online, you'd better delete sensitive data because who knows what will be done with it." It's basically a manifesto for open source and local LLMs.
doesitoffendyou@reddit (OP)
It's a MoE with 1.5B parameters, 50 million activated, Apache 2.0 license.
"Privacy Filter is designed for practical privacy filtering in noisy, real-world text. That includes long documents, ambiguous references, mixed-format strings, and software-related secrets." Model card here
ResidentPositive4122@reddit
And IIUC, this should be fast AF. It doesn't generate tokens; it classifies them in one pass and gives you a set of detections and scores.
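To make the "classify in one pass, get detections and scores" idea concrete, here is a minimal sketch of the post-processing side only. It does not call the actual model: the per-token predictions are hard-coded, and the label names (`NAME`, `EMAIL`, `O`), the `TokenPrediction` and `Detection` types, and the 0.5 threshold are all assumptions made for illustration, not the model's real output schema.

```python
from dataclasses import dataclass


@dataclass
class TokenPrediction:
    start: int    # character offset where the token begins
    end: int      # character offset just past the token
    label: str    # hypothetical label; "O" means "not PII"
    score: float  # classifier confidence for that label


@dataclass
class Detection:
    start: int
    end: int
    label: str
    score: float


def merge_detections(preds, threshold=0.5):
    """Merge adjacent same-label tokens into character-level spans."""
    detections = []
    for p in preds:
        if p.label == "O" or p.score < threshold:
            continue
        last = detections[-1] if detections else None
        # Extend the previous span if the label matches and the tokens
        # are separated by at most one character (e.g. a single space).
        if last and last.label == p.label and p.start - last.end <= 1:
            last.end = p.end
            last.score = min(last.score, p.score)  # keep the weakest score
        else:
            detections.append(Detection(p.start, p.end, p.label, p.score))
    return detections


def mask(text, detections):
    """Replace each detected span with a [LABEL] placeholder."""
    out, cursor = [], 0
    for d in detections:
        out.append(text[cursor:d.start])
        out.append(f"[{d.label}]")
        cursor = d.end
    out.append(text[cursor:])
    return "".join(out)


text = "Email John Smith at john@example.com today."
# Hard-coded stand-in for what a single forward pass might return.
preds = [
    TokenPrediction(0, 5, "O", 0.99),
    TokenPrediction(6, 10, "NAME", 0.97),
    TokenPrediction(11, 16, "NAME", 0.91),
    TokenPrediction(17, 19, "O", 0.98),
    TokenPrediction(20, 36, "EMAIL", 0.99),
    TokenPrediction(37, 42, "O", 0.99),
    TokenPrediction(42, 43, "O", 0.99),
]
detections = merge_detections(preds)
masked = mask(text, detections)
print(masked)
```

Since every token gets its label in the same forward pass, the cost is one model call per chunk of text plus this linear merge, rather than one call per generated token.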