protatoquests: Proxy Rotation Requests
Posted by nicoloboschi@reddit | Python | 5 comments
I wanted to showcase my newest Python library, which I have been using for some months now for anonymous web scraping.
Repo: https://github.com/nicoloboschi/protatoquests
What My Project Does
Helps with web scraping by rotating proxies so that the server does not block your IP address or rate-limit you.
Proxies are gathered automatically from https://advanced.name/freeproxy.
It's free, open source, and based on free proxies.
pip install protatoquests
import requests
import protatoquests
# this one will contact the server directly
response = requests.get("https://google.com")
# this one will contact the server using an anonymous proxy
response = protatoquests.get("https://google.com")
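For readers curious about what proxy rotation looks like underneath, here is a minimal sketch. It is not the library's actual code: `proxies_for` and `get_with_rotation` are hypothetical names, and the only real API it relies on is the `proxies` and `timeout` arguments that `requests` accepts on `get`. The session is passed in, so any object with a compatible `get` method works.

```python
def proxies_for(address):
    """Build the mapping that requests accepts in its `proxies` argument."""
    proxy_url = f"http://{address}"
    # The same HTTP proxy is used for both plain and TLS traffic.
    return {"http": proxy_url, "https": proxy_url}


def get_with_rotation(session, url, addresses, timeout=5):
    """Try each proxy address in order; return the first response that arrives.

    Hypothetical helper for illustration, not part of protatoquests.
    """
    last_error = None
    for address in addresses:
        try:
            return session.get(url, proxies=proxies_for(address), timeout=timeout)
        except Exception as error:
            # A real version would likely catch requests.RequestException only.
            last_error = error
    raise RuntimeError("no proxy in the list worked") from last_error
```

With `requests` installed, this could be called as `get_with_rotation(requests.Session(), "https://google.com", ["203.0.113.5:8080"])`, where the address is a placeholder from a documentation-only IP range.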
Target Audience
Any developer who needs to do serious web scraping.
It is not meant for production: requests pass through free third-party proxies, so credentials might leak if the server is protected by authentication.
Comparison
There are some similar alternatives, such as proxyscrape, but they are outdated and are not drop-in replacements (you need to get the proxies yourself, pass them to the library, and so on).
trd1073@reddit
thanks for the work. in addition to the asyncio suggestion others put up, i had a few.
let the user choose ttl for the proxy cache to suit their needs/situation.
another person had suggested randomly choosing from the list of proxies. this is a great idea, could even let the user choose the behaviour (ie first, last, random, etc). one project i used relied on a list of proxies. those that just restarted the docker container when things stopped working had bad results. those that shuffled the proxies between restarts had drastically different results, ie things worked for the most part.
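The two suggestions above (a user-chosen TTL and a user-chosen selection behaviour) could be sketched roughly as follows. Everything here is an assumption made for illustration: `ProxyCache`, its `loader` callable, and the strategy names are hypothetical and do not come from protatoquests.

```python
import random
import time


class ProxyCache:
    """Hypothetical proxy cache with a configurable TTL and pick strategy."""

    def __init__(self, loader, ttl_seconds=300, strategy="random", clock=time.monotonic):
        self._loader = loader          # callable returning an iterable of proxies
        self._ttl = ttl_seconds        # how long a fetched list stays valid
        self._strategy = strategy      # "first", "last" or "random"
        self._clock = clock            # injectable so the TTL can be tested
        self._proxies = []
        self._loaded_at = None

    def _refresh_if_stale(self):
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._proxies = list(self._loader())
            self._loaded_at = now

    def pick(self):
        """Return one proxy according to the configured strategy."""
        self._refresh_if_stale()
        if not self._proxies:
            raise LookupError("proxy list is empty")
        if self._strategy == "first":
            return self._proxies[0]
        if self._strategy == "last":
            return self._proxies[-1]
        if self._strategy == "random":
            return random.choice(self._proxies)
        raise ValueError(f"unknown strategy: {self._strategy!r}")
```

A short TTL suits free proxy lists that go stale quickly, while a longer one reduces how often the list source is hit.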
tacothecat@reddit
Great name
Fenzik@reddit
Fun little project. I see you are looping over the cached proxy list every time. Wouldn’t it make sense to shuffle them or draw a random one every time? Now all requests will go through the first proxy as long as it’s working instead of actually rotating. But if the first is blocked, subsequent requests will still try the first one before trying the second, wasting time.
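One way to address both points (spreading requests across proxies and not retrying a blocked one first) is sketched below. This is only an illustration under assumptions: `RotatingPool` and the `fetch(url, proxy)` callable are hypothetical names, not the library's code.

```python
import random


class RotatingPool:
    """Hypothetical pool that shuffles per request and forgets failed proxies."""

    def __init__(self, proxies, rng=None):
        self._alive = list(proxies)
        self._rng = rng or random.Random()

    def request(self, fetch, url):
        # Visit the live proxies in a fresh random order on every call,
        # so traffic is spread out instead of always hitting the first one.
        candidates = self._rng.sample(self._alive, k=len(self._alive))
        for proxy in candidates:
            try:
                return fetch(url, proxy)
            except Exception:
                # Drop the failing proxy so later calls do not waste time on it.
                self._alive.remove(proxy)
        raise RuntimeError("every proxy in the pool failed")
```

Dropping a proxy after a single failure is deliberately blunt. A gentler variant might count failures or re-admit proxies after a cooldown.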
FisterMister22@reddit
Nice, I've built something like that on my own (not free proxies, but not rotating either, I rotate them manually)
My question is, does it support async? With aiohttp
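For context, an async version of the same idea might look something like the sketch below. It uses only the standard library's `asyncio`. The `fetch(url, proxy)` coroutine is a hypothetical stand-in that a caller would implement, for example with aiohttp, whose request methods accept a `proxy` argument. Nothing here reflects what protatoquests does today.

```python
import asyncio


async def fetch_with_rotation(fetch, url, proxies, timeout=5):
    """Await `fetch(url, proxy)` for each proxy in turn until one succeeds.

    `fetch` is a hypothetical coroutine function supplied by the caller.
    """
    for proxy in proxies:
        try:
            # Give up on a slow proxy after `timeout` seconds and move on.
            return await asyncio.wait_for(fetch(url, proxy), timeout)
        except Exception:
            continue  # this proxy failed or timed out, try the next one
    raise RuntimeError("every proxy failed")
```

Several URLs could then be scraped concurrently by passing multiple `fetch_with_rotation(...)` calls to `asyncio.gather`.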
nicoloboschi@reddit (OP)
Definitely in the improvements list! Thanks for looking into it