I published my first PyPI package a few days ago. Copycat packages appeared claiming to "outperform" it
Posted by Obvious_Gap_5768@reddit | Python | 86 comments
I launched repowise on PyPI a few days ago. It's a tool that generates and maintains structured wikis for codebases, among other things.
This morning I searched for my package on PyPI and found three new packages all uploaded around the same time, all with the exact same description:
"Codebase intelligence that thinks ahead - outperforms repowise on every dimension"
They literally name my package in their description. All three appeared within hours of each other.
I haven't even checked what's inside them yet, but the coordinated timing and identical copy is sketchy at best, malicious at worst.
Has anyone else dealt with this kind of targeted squatting/spam on PyPI? Is there anything I can do?
Automatic-Contact963@reddit
Ah yes, the classic "we fixed your code but won't contribute back" move. AGPL violation on top of being shady. Report them to PyPI.
Kernixdev@reddit
yeah this is becoming way more common now with LLMs. someone just forks your code, runs it through chatgpt to tweak a few things, and reuploads it as "better." lazy af
since they're violating AGPL-3.0 though you actually have solid options:
- email admin@pypi.org with the package names + your original link + evidence. they take license violations seriously
- if theres github repos behind those packages, file a DMCA takedown. AGPL means they have to keep the same license and give attribution, they did neither
- screenshot everything NOW — their pypi pages, descriptions, upload times, your commit history. if they delete before pypi acts you want receipts
honestly the fact they literally name your package in their description makes your case easier. thats not coincidental lol
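A minimal sketch of the "get receipts" step, assuming PyPI's public JSON endpoint (`https://pypi.org/pypi/<name>/json`) is still serving the copycat's page. The helper names `extract_evidence` and `snapshot` and the output filename are made up for illustration, and this only complements screenshots, it doesn't replace them.

```python
import json
import urllib.request
from datetime import datetime, timezone


def extract_evidence(payload: dict) -> dict:
    """Pull the fields most useful for a report out of a PyPI JSON response."""
    info = payload.get("info", {})
    # Collect the upload timestamp of every file in every release.
    uploads = [
        f.get("upload_time_iso_8601")
        for files in payload.get("releases", {}).values()
        for f in files
        if f.get("upload_time_iso_8601")
    ]
    return {
        "name": info.get("name"),
        "summary": info.get("summary"),
        "license": info.get("license"),
        "project_urls": info.get("project_urls"),
        # ISO 8601 strings in the same format sort chronologically.
        "first_upload": min(uploads) if uploads else None,
        "captured_at": datetime.now(timezone.utc).isoformat(),
    }


def snapshot(package: str) -> dict:
    """Fetch a package's metadata from PyPI and save the evidence to disk."""
    url = f"https://pypi.org/pypi/{package}/json"
    with urllib.request.urlopen(url, timeout=10) as resp:
        payload = json.load(resp)
    evidence = extract_evidence(payload)
    # Hypothetical filename; pick whatever fits your own records.
    with open(f"{package}-evidence.json", "w", encoding="utf-8") as fh:
        json.dump(evidence, fh, indent=2)
    return evidence
```

Calling `snapshot("<copycat-name>")` for each package writes one small JSON file per copycat, with the time of capture recorded next to the description and first upload time.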
Kernixdev@reddit
This is unfortunately becoming more common — LLM-assisted package
squatting. Someone forks your code, runs it through a model to make
superficial changes, and republishes it as "better" without attribution.
Since they're violating AGPL-3.0, you have real leverage here:
- Email admin@pypi.org with the package names,
your original package link, and evidence of the copied
code/description. PyPI takes DMCA and license violations seriously.
- If there are GitHub repos behind those packages,
file a takedown through GitHub's DMCA process. AGPL requires
derivative works to carry the same license and provide attribution;
they did neither.
- Screenshot everything now: their PyPI pages,
descriptions, upload timestamps, and your original commit history.
If they take them down before PyPI acts, you want proof.
The identical descriptions naming your package directly actually helps
your case — it's clearly targeted, not coincidental.
riricide@reddit
Real work is getting outnumbered by these LLM-powered spambots. Sorry you're having to deal with this. Maybe once you figure out the process of mitigation make a blog or post so others have a resource to look to when this happens to them. Also, would you recommend a different license type given this situation - or do you think your current license protects you well enough? Curious because I work in open software and generally don't pay attention to the licensing so much -- but if it's going to be co-opted by malware then it makes sense to think about this properly.
Obvious_Gap_5768@reddit (OP)
Honestly AGPL is exactly the right license for this situation. It requires anyone who forks your code to keep the same license, give attribution, and open source their changes. If I had gone with MIT they could have done all of this completely legally. The blog idea is great, I'll probably write one once the dust settles. For your work I'd seriously look into AGPL if you want maximum protection. It scares off most bad actors because they can't just take your code and close source it.
james_pic@reddit
But the original is already AGPL, and the bad actors who forked it didn't care (and illegitimately relicenced it MIT).
AGPL, GPL and other copyleft licences dissuade lawsuit-averse actors, whether good, bad or neutral, but have no effect on actors who don't care.
riricide@reddit
That's helpful - I'll definitely look into AGPL, because MIT is my default and like you said, it's not enough protection.
sheriffSnoosel@reddit
Sus — bots hijacking pypi releases seems par for the course though
Obvious_Gap_5768@reddit (OP)
Yeah but these aren't random bots. They forked my actual code, ran it through an LLM to patch a couple things, and republished under new names. That's a step beyond the usual PyPI spam
Competitive_Travel16@reddit
...I got downvoted for hoping you picked up their patches. On reflection, maybe they want you to!
sheriffSnoosel@reddit
Claw-bots
vivaaprimavera@reddit
That makes me wonder if they acted solo or under orders.
A rogue agent creating forks on its own is kind of a disturbing idea.
xX_PlasticGuzzler_Xx@reddit
with them, they can start out following orders but then go rogue as time passes
sheriffSnoosel@reddit
I dabble in the token spending arts and this would be easy to do and you should post your copycats
Competitive_Travel16@reddit
I hope you pulled their patches.
Jdonavan@reddit
What is your actual objection?
ZucchiniMore3450@reddit
That's just a modern bot, but it should be stopped. Please report them.
Zumochi@reddit
Is that an em-dash?
AssociateEmotional11@reddit
Maybe his API key got leaked through the chatbot; that's the only way, if his computer is already secure
sudomatrix@reddit
What do you mean "leaked"? Anyone (or any bot) can download his code, run it through an LLM, and upload a new project with the modified code. Nothing leaked.
andrewprograms@reddit
I got an idea for your next open source contribution
andrewprograms@reddit
Make something to watch for slimeballs like them, and then help connect honest devs to the tools to compare similarity and to the help resources you’re identifying.
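A rough sketch of what such a watcher's core check could look like, using only the standard library. It assumes you already have a mapping of package names to one-line summaries from somewhere (the thread doesn't say where, so that part is left out); `flag_suspects` and the `0.75` threshold are illustrative choices, not an established tool.

```python
from difflib import SequenceMatcher


def flag_suspects(own_name, candidates, threshold=0.75):
    """Return names from `candidates` that look like copycats of `own_name`.

    `candidates` maps package name -> one-line summary (may be empty).
    A package is flagged if its name is very similar to ours, or if its
    summary mentions our package by name.
    """
    own = own_name.lower()
    suspects = []
    for name, summary in candidates.items():
        if name.lower() == own:
            continue  # that's our own package
        similar_name = SequenceMatcher(None, own, name.lower()).ratio() >= threshold
        names_us = own in (summary or "").lower()
        if similar_name or names_us:
            suspects.append(name)
    return suspects
```

The name check catches near-identical names, and the summary check catches descriptions that call out the original package directly. Anything flagged still needs a human look before reporting.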
DJ_Laaal@reddit
Snipe the snipers. Like it!!
Smok3dSalmon@reddit
Sounds like a future malware honeypot. I’m going to check out repowise now
ThinAndFeminine@reddit
Or just some dude trying to pad his resume with BS open source contributions. Happens all the time unfortunately.
dc_IV@reddit
More like their "CV"...
Obvious_Gap_5768@reddit (OP)
Appreciate that! Here's the repo if you want to check it out: https://github.com/repowise-dev/repowise. Would love any feedback
MrSlaw@reddit
How do you have 671 stars on a repo that was created within the past two weeks, for a Python package that had its first release only 8 days ago?
https://pypi.org/project/repowise/#history
https://github.com/repowise-dev/repowise/commit/e0a4ce87b2981007fb84cf292699b00d04413f4f
Seems kinda suspect, to me at least.
Sigmatics@reddit
Unfortunately it's pretty normal these days with anything that has AI in it. Just check the trending section on Github
Psy_Fer_@reddit
Check out my plotting library kuva, it's on 670 stars after a month. Got 500+ in 5 days or so. It went well on a post in the rust subreddit. Not that suss
notParticularlyAnony@reddit
Sus af
Obvious_Gap_5768@reddit (OP)
We had a LinkedIn post that did really well and a post on X that brought in a lot of early traffic. Also been doing direct outreach to developers in the codebase tooling space. And this is something that developers need right now. Happy to answer any other questions about it
Ok_Tap7102@reddit
Great idea! Keen for any feedback on the improvements we've made
https://pypi.org/project/repobrain/
Obvious_Gap_5768@reddit (OP)
You forked my AGPL-licensed code, made a few LLM-assisted tweaks, and republished it under a new name without any attribution or license compliance. If you actually wanted to improve something, you could have just opened a PR. That's how open source works
Darwinmate@reddit
wait, is repobrain one of the copy cats you're complaining about?
Holy shit this is hilarious. They have no shame!
Obvious_Gap_5768@reddit (OP)
Yep, that's literally one of the three. Can't make this stuff up
AutoModerator@reddit
Your submission has been automatically queued for manual review by the moderation team because it has been reported too many times.
Please wait until the moderation team reviews your post.
I am a bot, and this action was performed automatically. Please contact the moderators of this subreddit if you have any questions or concerns.
FoeHammer99099@reddit
You can contact legal@python.org to report packages that infringe on your intellectual property. GitHub has their own DMCA takedown system.
Your complaints should be specific and factual. Are you the only author of the original code? How much of the infringing code is identical to yours? Include the license that you released your code under, and specify which terms of that license were not followed.
If there's a person on the other side, you can probably get pretty far by saber-rattling and threatening to do this if they don't comply with the license.
https://peps.python.org/pep-0541/#intellectual-property-policy
https://docs.github.com/en/site-policy/content-removal-policies/guide-to-submitting-a-dmca-takedown-notice#complaints-about-anti-circumvention-technology
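To put a number on "how much of the infringing code is identical to yours", here is a small sketch using Python's standard `difflib`. It assumes you have both source trees unpacked locally and that the copy kept the same relative file paths; `line_similarity` and `compare_trees` are hypothetical helper names, and the ratio is a rough indicator rather than legal proof.

```python
from difflib import SequenceMatcher
from pathlib import Path


def line_similarity(original: str, suspect: str) -> float:
    """Similarity ratio in [0, 1] over non-blank lines, ignoring indentation."""
    a = [ln.strip() for ln in original.splitlines() if ln.strip()]
    b = [ln.strip() for ln in suspect.splitlines() if ln.strip()]
    if not a and not b:
        return 1.0
    # autojunk=False so frequently repeated lines are not silently ignored.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def compare_trees(original_dir: str, suspect_dir: str) -> dict:
    """Compare .py files that share a relative path in both directories."""
    orig_root, sus_root = Path(original_dir), Path(suspect_dir)
    report = {}
    for path in sorted(orig_root.rglob("*.py")):
        rel = path.relative_to(orig_root)
        twin = sus_root / rel
        if twin.is_file():
            report[str(rel)] = round(
                line_similarity(
                    path.read_text(encoding="utf-8", errors="replace"),
                    twin.read_text(encoding="utf-8", errors="replace"),
                ),
                3,
            )
    return report
```

Files that were renamed or moved in the copy won't be paired by this, so a low file count in the report doesn't mean the code wasn't copied.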
Obvious_Gap_5768@reddit (OP)
This is super helpful, thanks for the links. Yes I'm a co-author along with my co-founder. We'll be filing reports with both PyPI and GitHub. Appreciate the detailed pointers
Puzzleheaded-Tax-654@reddit
Bear in mind that under current US copyright law, your code is not subject to copyright if it is generated via AI. OFC this can be hard to prove, but that’s how it is for now.
Sigmatics@reddit
It's hard enough to prove with written text, with code it's borderline impossible
Competitive_Travel16@reddit
If the resulting repo is 95% AI generated and 5% original work, your license still applies.
glenrhodes@reddit
The AGPL violation is the more actionable issue. PyPI abuse reports can be slow but AGPL enforcement is something the SFC will actually pursue if you document it well. The coordinated timing and identical descriptions suggest one actor, which strengthens a takedown request.
Aggressive_Pay2172@reddit
this is a good reminder to add clear licensing + attribution requirements
and maybe even a NOTICE file
doesn’t stop bad actors, but makes enforcement easier
especially when reporting
Cool-Nefariousness76@reddit
I had a similar experience around one year ago, but with some differences.
I published my package sqlmodelgen and not long after that there was a package named sqlmodelgenerator (super similar name), probably AI generated (docs full of emojis and full of dependencies), without a link to the repo.
Tricky-Battle-9138@reddit
Yeah this is starting to feel like SEO spam but for code
I had something similar happen with a small side project and it showed up like 2 days later under a different name
Did you already report it to PyPI? They were actually pretty quick when I did
Aggressive_Pay2172@reddit
this honestly smells like some automated “package farming” setup
scrape new releases → fork → tweak with LLM → republish with SEO-ish titles
seen similar stuff popping up lately
GrumpySimon@reddit
...and at some point insert a supply chain attack
Obvious_Gap_5768@reddit (OP)
That's exactly what it looks like. The identical descriptions and timing make it obvious it's automated. Scary how easy it is to do this at scale now with LLMs in the mix
schech425@reddit
.
WildCard65@reddit
I looked a bit into all 3 packages, they are from the same person linked to the same github repository.
Obvious_Gap_5768@reddit (OP)
Ha, not even trying to hide it. Thanks for digging into that, good to have it confirmed
YirosMan2026@reddit
This is kinda off topic but will be very helpful for our project! Will definitely take a look at it! - Yiros Man
Obvious_Gap_5768@reddit (OP)
Appreciate that! Here you go: https://github.com/repowise-dev/repowise. Let me know what you think
oclafloptson@reddit
This is why it's so important to make triple sure you're using the correct package. There's no telling how compromised the copycats could be
Obvious_Gap_5768@reddit (OP)
Exactly. Always double check the package name and author before installing. These copycats could have anything in there
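One way to do that check for something already installed, sketched with the standard library's `importlib.metadata`. The helper names `describe` and `describe_installed` are made up here. Note the fields are self-declared by whoever uploaded the package, so a matching author name is a hint, not proof.

```python
from importlib import metadata


def describe(meta) -> dict:
    """Pick out the identifying fields from a package's core metadata."""
    fields = ("Name", "Version", "Author", "Author-email", "Home-page", "License")
    # Missing fields come back as None rather than raising.
    return {field: meta.get(field) for field in fields}


def describe_installed(dist_name: str) -> dict:
    """Look up an installed distribution; raises PackageNotFoundError if absent."""
    return describe(metadata.metadata(dist_name))
```

Running `describe_installed("repowise")` in the environment you installed into shows what you actually got. Before installing, the project page on PyPI and its linked repository are the things to compare.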
Independent-Sir3234@reddit
This happens to more packages than you'd think, usually within days of hitting some visibility threshold. I've seen this exact pattern twice — once with a small scraping library I put up, once with a coworker's CLI tool. PyPI's security team is surprisingly responsive if you report it through their malware form, got a resolution within 48 hours both times.
Obvious_Gap_5768@reddit (OP)
Already reported all three through PyPI. Hope they come back soon. Funny how hitting any visibility at all instantly attracts these people. Thanks for sharing your experience.
ZCEyPFOYr0MWyHDQJZO4@reddit
Same thing happens with books.
nphare@reddit
I have a friend who published a book on a very niche topic. She intentionally made up a few words which would look like actual words to people not familiar with the topic. Then she would scan for these words in others' works and tell them to take it down. Worked fairly well. I was surprised anyone would care enough to copy such a small niche topic, but some did.
iamevpo@reddit
In the README you mention symbols. Are they keywords and... any words, or tokens or literals?
And sorry about the copycats.
4xi0m4@reddit
Tree-sitter actually gives you the full syntax tree, so symbols are the named nodes like function_definition, class_definition, assignment, import_statement, etc. (the exact node names depend on the language grammar). It parses the code into a tree where every node is typed, so you get way more than just keywords. If you want I can share the repowise repo, happy to chat more about the approach.
Obvious_Gap_5768@reddit (OP)
Thanks! Symbols are things like functions, classes, variables, imports, basically anything tree-sitter can parse from the AST. So not just keywords but actual code entities with their relationships and context
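To make "symbols" concrete, here is a small illustration that uses Python's built-in `ast` module instead of tree-sitter. It is not how repowise does it (the thread doesn't show that), and `extract_symbols` is a hypothetical name; it only shows the kind of code entities a typed syntax tree exposes.

```python
import ast


def extract_symbols(source: str) -> dict:
    """Collect functions, classes, imports and assigned names from Python source."""
    symbols = {"functions": [], "classes": [], "imports": [], "variables": []}
    # ast.walk visits every node in the tree, in no guaranteed order.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            symbols["functions"].append(node.name)
        elif isinstance(node, ast.ClassDef):
            symbols["classes"].append(node.name)
        elif isinstance(node, ast.Import):
            symbols["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Relative imports like "from . import x" have no module name.
            symbols["imports"].append(node.module or ".")
        elif isinstance(node, ast.Assign):
            symbols["variables"].extend(
                t.id for t in node.targets if isinstance(t, ast.Name)
            )
    return symbols
```

Every entry comes from a typed node rather than from matching keywords in text, which is the point being made above. The relationships between symbols would need more than this sketch collects.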
UseMoreBandwith@reddit
malware?
alex1033@reddit
Can be malware.
Obvious_Gap_5768@reddit (OP)
Honestly wouldn't be surprised. Here's the actual project if you want to check it out: https://github.com/repowise-dev/repowise
alex1033@reddit
Thank you
AI_Tonic@reddit
the good news is that it's probably originating from github , the bad news is it's still spam
AssociateEmotional11@reddit
recommend using the MIT license before publishing
Obvious_Gap_5768@reddit (OP)
They forked my AGPL-3.0 code and republished it without attribution or license compliance. MIT would have made that completely legal. AGPL is the reason I actually have leverage here
gscjj@reddit
I guess this is the problem with open source licenses in general. You have leverage, but what do you do? Sue? Try to get them removed?
brasticstack@reddit
Same problem that exists with violations of proprietary licenses too. They have to be enforced by lawyers or the threat of lawyers. Elsewhere in this thread another commenter mentioned that both pypi and github are responsive to takedown requests.
Obvious_Gap_5768@reddit (OP)
Honestly I don't have the resources to sue anyone. Planning to report them to PyPI and see if they take action. Beyond that, the community awareness from posts like this probably does more than any legal route would
brasticstack@reddit
How would that solve OP's problem, which is a lack of attribution contrary to the terms of AGPL?
AssociateEmotional11@reddit
Because the MIT license allows everyone to use it but not copy the whole source (this is open source). If you have a better way to solve his problem, comment
artofthenunchaku@reddit
If they copied the code in violation of one license, what makes you think they'd respect a different license?
AssociateEmotional11@reddit
Then if you know exactly what can help him, do it? I guess
bakugo@reddit
I don't think you understand the licenses you're talking about
paul_h@reddit
Did the back create all your git history with their ID for committer?
hmoff@reddit
What’s the AGPL violation?
Obvious_Gap_5768@reddit (OP)
They forked my AGPL-3.0 repo, made minor changes using an LLM, and republished under completely new names with no attribution and no license preserved. AGPL requires you to keep the license, credit the original author, and share your source under the same terms. They did none of that
hmoff@reddit
Ok you didn't mention that the license wasn't preserved.
paul_h@reddit
Yes they did
AutoModerator@reddit
Hi there, from the r/Python mods.
Your post has been removed because we are no longer accepting standalone posts that primarily link to or showcase a repository.
We have seen a significant increase in low-effort AI-generated project submissions and are directing all project and repository shares to our monthly showcase thread, which is pinned at the top of the subreddit.
Please repost your project there — the community will still see it and you're more likely to get thoughtful feedback in that format.
If your post is a Discussion and the repository link is supplementary context rather than the focus of the post, please reach out via mod mail and we'll review it.
Warm regards and all the best for your future Pythoneering,
/r/Python moderator team
ElderberryPrevious45@reddit
This kind of hacking might be difficult to prevent other than by taking the possibility into account at the design stage. That is, if you have any (shorter perspectives?) profits in mind.
Obvious_Gap_5768@reddit (OP)
Honestly not sure I follow. Can you clarify what you mean by shorter perspectives profits?