Web scraping publicly accessible university course info for a portfolio project — is this okay/legal
Posted by adad239_@reddit | learnprogramming | 10 comments
I’m working on a personal portfolio project and wanted to sanity check if what I’m doing is okay.
Basically, I’m scraping publicly available university academic calendar pages (no login or credentials needed). From there, I extract program requirements, course descriptions, and prerequisites, and I also follow links between courses (e.g., required courses link out to their own pages with more details).
I store all of this in a database and build a graph of courses → prerequisites → program structure. On top of that, I use it to map a student’s transcript to their program, show what courses are available/locked, and generate basic course recommendations.
Just to be clear:
- this is strictly for a portfolio project
- I’m not monetizing it
- I’m not exposing or redistributing the raw scraped data
- it’s all from publicly accessible pages (no login required)
I just wanted to ask whether this is generally considered okay/legal.
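For readers who want to picture the available/locked step described above, here is a minimal sketch. It assumes the scraped prerequisites have already been reduced to a plain mapping of course code to required course codes. The course codes, the `PREREQS` data, and the `classify_courses` helper are all hypothetical and not taken from the actual project.

```python
# Hypothetical course data: course code -> set of prerequisite course codes.
PREREQS = {
    "CS101": set(),
    "CS102": {"CS101"},
    "MATH100": set(),
    "CS201": {"CS102", "MATH100"},
    "CS301": {"CS201"},
}


def classify_courses(prereqs, completed):
    """Split courses not yet taken into available and locked.

    A course is available when every prerequisite is in `completed`;
    otherwise it is locked, and the missing prerequisites are recorded.
    """
    available = []
    locked = {}
    for course, required in sorted(prereqs.items()):
        if course in completed:
            continue  # already on the transcript
        missing = required - completed
        if missing:
            locked[course] = sorted(missing)
        else:
            available.append(course)
    return available, locked


available, locked = classify_courses(PREREQS, completed={"CS101", "MATH100"})
print(available)  # ['CS102']
print(locked)     # {'CS201': ['CS102'], 'CS301': ['CS201']}
```

Real calendars usually have "one of" or co-requisite rules as well, which a flat set of prerequisites like this does not capture.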
Environmental_Gap_65@reddit
Web scraping itself isn’t inherently illegal, especially when you’re dealing with publicly accessible data. Where people tend to run into issues is when they violate a site’s terms of service, scrape content behind authentication, or copy protected content in a way that raises copyright or database rights concerns.
Even then, it’s often a civil/legal grey area rather than clearly criminal, and enforcement usually comes in the form of cease-and-desist requests rather than lawsuits.
Monetizing a project can increase legal risk depending on how the data is used, but for a small, non-commercial portfolio project like this, the practical risk is basically none — just make your scraper polite, implement rate limits and don't spam their servers.
Lawsuits over scraping do happen, but they’re relatively uncommon even at the *enterprise level*. For a personal project like yours, they're basically nonexistent. If anyone asks you to stop, you stop, but I doubt that would ever happen, honestly.
Mediocre_Half6591@reddit
you're probably good to go, most unis don't really care about portfolio scraping as long as you're not hammering their servers
just keep it reasonable with requests and maybe add some delays between calls so you don't accidentally ddos them or something. worst case they send you an email asking to stop and you just comply
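As a rough illustration of "delays between calls", here is a minimal sketch using only the Python standard library. The 2-second delay, the `USER_AGENT` string, and the `fetch_page` / `fetch_all` helpers are assumptions made for the example, not anything a particular university asks for.

```python
import time
import urllib.request

# Hypothetical identifier so the site operator can tell who is crawling.
USER_AGENT = "portfolio-course-scraper (contact: you@example.com)"


def fetch_page(url, timeout=10):
    """Download one page and return its body as text."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_all(urls, delay_seconds=2.0, fetch=fetch_page, sleep=time.sleep):
    """Fetch URLs one at a time, pausing between consecutive requests.

    `fetch` and `sleep` are parameters so the pacing logic can be
    exercised without touching the network.
    """
    pages = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # wait before every request except the first
        pages[url] = fetch(url)
    return pages


# Example call (placeholder URLs, not run here):
# pages = fetch_all(["https://university.example/calendar/cs101",
#                    "https://university.example/calendar/cs102"])
```

Fetching sequentially with a fixed pause is the simplest way to stay gentle. Caching pages locally so they are not downloaded again on every run helps too.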
GlobalWatts@reddit
There absolutely can be issues regarding unauthorized access, or copyright violation.
But in most jurisdictions, and for most cases, there are no laws against it. It's entirely at the discretion of whoever operates the website and/or owns the data.
If they have a published Terms of Use, adhere to it. If they don't, ask them. If you're afraid to ask them because they might say no, then you already have your answer.
The stakes are obviously lower for a personal project. But also if the point is to demonstrate your capabilities, it's easier to choose a data source that you know is legally and ethically safe to use. You don't even have to use real data at all; web scraping isn't exactly a marketable skill, and these days any idiot can script it with the help of an LLM. In a way they've already done the scraping for you.
CrabPresent1904@reddit
yeah i use developers for scraping projects like this
MeLittleThing@reddit
Check if the site has a robots.txt file.
Or ask them directly, so you'll know their policy about it
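For what it's worth, Python's standard library can read robots.txt rules for you. Below is a minimal sketch using `urllib.robotparser`. The `SAMPLE_ROBOTS_TXT` contents, the `my-bot` user agent, and the `university.example` URLs are made up for illustration; a real site's file will differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, inlined so the example runs offline.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Crawl-delay: 5
"""


def build_parser(robots_txt):
    """Build a parser from the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)
print(parser.can_fetch("my-bot", "https://university.example/calendar/cs101"))  # True
print(parser.can_fetch("my-bot", "https://university.example/admin/users"))     # False
print(parser.crawl_delay("my-bot"))                                             # 5
```

Against a live site you would call `parser.set_url(...)` with the site's robots.txt address and then `parser.read()` instead of parsing an inline string. Keep in mind that robots.txt states the operator's crawling preferences; it is not the same thing as their terms of use.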
adad239_@reddit (OP)
Okay
MudkipGuy@reddit
If you aren't gaining unauthorized access to a computer system, it's not illegal to scrape. Public websites (ones that don't require an account for access) are fair game. You should still follow common courtesy to avoid being rate limited or IP banned.
MeLittleThing@reddit
sorry, you what? I'm afraid that typing "What is ANAL" into Google will return unwanted results
MudkipGuy@reddit
I am not a lawyer
Environmental_Gap_65@reddit
This guy anals.