how hard is it to build a custom browser engine for one website?
Posted by MyNameIsSquare@reddit | learnprogramming
i want to build a browser engine from scratch that only loads youtube videos. it should be able to do the following:
- sign in to a google account
- show urls of personalized suggested videos
- load videos and their statistics (descriptions, likes, dislikes, views, comments, likes and replies of comments)
- load livestreams and their live comments (not statistics)
- a CLI to choose videos, like/dislike/comment on videos, and post live comments
and to do that i think i need to implement these:
- google authentication
- web cookies and caches
- html parser
- how to do http requests/responses
- how to decompress data to get the video/audio
- how to render video/audio
since this is a personal project, i'm considering not implementing these:
- security
- javascript/css engine
- all versions of html/http
is there anything else that i haven't mentioned? i could just embed an existing browser engine into my app, but i want to know everything under the hood for educational purposes. many thanks in advance!!
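For the "web cookies" item in the list above, here is a minimal sketch of what a client has to do: remember what the server sends in `Set-Cookie` and send it back in a `Cookie` header on later requests. It leans on Python's standard `http.cookies` module instead of a hand-written parser, and the helper names `store_cookies` and `cookie_header` are made up for illustration. It ignores expiry, domain, and path matching, which a real browser has to handle.

```python
from http.cookies import SimpleCookie


def store_cookies(jar, set_cookie_headers):
    """Load each raw Set-Cookie header value into the jar (hypothetical helper)."""
    for raw in set_cookie_headers:
        jar.load(raw)
    return jar


def cookie_header(jar):
    """Build the Cookie request header value from everything in the jar.

    Assumption: every stored cookie applies to the request. A real client
    would filter by domain, path, expiry, and the Secure flag first.
    """
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


if __name__ == "__main__":
    jar = store_cookies(
        SimpleCookie(),
        ["SESSION=abc123; Path=/; HttpOnly", "PREF=dark; Path=/"],
    )
    print("Cookie:", cookie_header(jar))
```

Signing in to a Google account involves much more than this (redirects, tokens, and scripts), so treat this only as the storage half of the problem.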
AfterTheEarthquake2@reddit
You know, there's a reason why there are basically only 3 browser engines out there anymore. Building one that's able to run YouTube is gonna be a huge challenge. Are you sure you don't wanna start small and embed an existing browser?
MyNameIsSquare@reddit (OP)
i already limited the features to loading only one website and it's still that hard? yikes
actually i'm not really trying to implement everything from scratch. i just want to know what a browser has to do to load a website, and whether anything can be optimized if it's dedicated to one single website
EmperorLlamaLegs@reddit
A website isn't "a feature" that you are limiting yourself to.
You are only limiting your supported "features" to every single part of the standards that an incredibly complex web app relies on, while simultaneously ensuring that your app will break on nearly every update because the youtube engineers aren't testing on your platform.
Youtube also actively tries to stop anything but the major browsers from working, because it wants to stop piracy and ad skipping. So to ensure that your browser engine works, you will have to make it nearly identical to the major browsers. Given that... what's the point of making your own?
Skusci@reddit
The trick is that you don't actually want to load a website, which is hard. You want to display a YouTube video.
There are libraries like this that are meant to download a video from YouTube:
https://github.com/yt-dlp/yt-dlp
The downside is that YouTube may change their API and it will break your program till it's updated because it can't do all the other things a browser can do like run JavaScript.
Then you just need to figure out how to display the video.
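As a rough sketch of that approach, yt-dlp can be embedded from Python to fetch a video's metadata and stream URLs without downloading anything. This assumes yt-dlp is installed (`pip install yt-dlp`) and that its info dict exposes a `formats` list with `vcodec`, `acodec`, and `tbr` fields, which is how it behaves as far as I know. `fetch_info` and `pick_stream` are hypothetical helper names.

```python
def pick_stream(formats):
    """Pick the highest-bitrate format that carries both audio and video.

    Assumption: each format is a dict where a missing track is marked
    with the string "none" in "vcodec" or "acodec".
    """
    muxed = [
        f for f in formats
        if f.get("vcodec") != "none" and f.get("acodec") != "none"
    ]
    if not muxed:
        return None
    return max(muxed, key=lambda f: f.get("tbr") or 0)


def fetch_info(url):
    """Ask yt-dlp for metadata only (no download). Needs network access."""
    import yt_dlp  # third-party; imported lazily so the rest runs without it

    with yt_dlp.YoutubeDL({"quiet": True}) as ydl:
        return ydl.extract_info(url, download=False)


if __name__ == "__main__":
    import sys

    info = fetch_info(sys.argv[1])
    best = pick_stream(info.get("formats", []))
    print(info.get("title"))
    print(best["url"] if best else "no combined audio+video stream found")
```

The resulting stream URL could then be handed to whatever you use for playback, which is the "figure out how to display the video" part.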
RaderPy@reddit
can't wait for ladybird to make it 4
peterlinddk@reddit
It doesn't really sound like you want to build a custom browser engine, but rather like you want to create an alternative Youtube App - so maybe accessing the Youtube API https://developers.google.com/youtube/v3 would better suit your needs than going the long way round through a customized HTML browser ...
Just an idea ...
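To give a feel for what that looks like, here is a small sketch of asking the YouTube Data API v3 for a video's title and statistics. It assumes the `videos` endpoint with `part=snippet,statistics` and an API key from the Google Cloud console; `videos_url` and `summarize` are made-up helper names, and the sample JSON in the demo is hand-written, not real API output. As far as I know, public dislike counts are no longer returned, so only views and likes are read.

```python
import json
from urllib.parse import urlencode

API_BASE = "https://www.googleapis.com/youtube/v3/videos"


def videos_url(video_id, api_key):
    """Build the request URL for one video's snippet and statistics."""
    query = urlencode({"part": "snippet,statistics", "id": video_id, "key": api_key})
    return f"{API_BASE}?{query}"


def summarize(response_text):
    """Pull title, views, and likes out of a videos response body."""
    items = json.loads(response_text).get("items", [])
    if not items:
        return None
    item = items[0]
    stats = item.get("statistics", {})
    return {
        "title": item["snippet"]["title"],
        # Counts arrive as strings in the JSON, so convert them.
        "views": int(stats.get("viewCount", 0)),
        "likes": int(stats.get("likeCount", 0)),
    }


if __name__ == "__main__":
    print(videos_url("VIDEO_ID", "YOUR_API_KEY"))  # placeholders, not real values
    fake = '{"items": [{"snippet": {"title": "demo"}, "statistics": {"viewCount": "10"}}]}'
    print(summarize(fake))
```

Fetching the URL with `urllib.request.urlopen` and passing the body to `summarize` would complete the round trip. Liking or commenting needs OAuth rather than a plain API key.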
MyNameIsSquare@reddit (OP)
i will look into it, thx!
notislant@reddit
Just FYI a lot of the youtube API is a bitch to access. I wanted to use it for something, they accidentally gave me partial access to it without sending them a 'Power Point business plan'. I followed up with something and they removed my partial access, telling me I need to send them a power point business plan.
Was just trying to basically make a free version of those SEO keyword tools.
HashDefTrueFalse@reddit
Hard. I built a rubbish browser once. HTML parsing and rendering some basic elements isn't too bad, depends how much of it you want to support. I didn't support CSS or JS because that's tons of work I didn't want to do at the time. You don't need google auth. Streaming video is a learning curve.
You'll probably end up with something that displays basic HTML documents and call it a day because browsers have tons of complexity.
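To illustrate the "basic HTML documents" end of the scale, here is a minimal text-only renderer. It is a sketch, not what the commenter built: it uses Python's built-in `html.parser` to stand in for a hand-written tokenizer, the set of supported tags is an arbitrary assumption, and `TextRenderer` and `render_text` are made-up names. Unsupported elements simply emit nothing of their own, and script and style contents are dropped.

```python
import re
from html.parser import HTMLParser


class TextRenderer(HTMLParser):
    BLOCK = {"p", "div", "h1", "h2", "h3", "li", "br"}  # tags that break lines
    SKIP = {"script", "style"}  # tags whose text is never shown

    def __init__(self):
        super().__init__()
        self.lines = [""]
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.lines.append("")  # start a fresh output line

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.lines.append("")

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Collapse whitespace runs, roughly as browsers do for inline text.
            self.lines[-1] += re.sub(r"\s+", " ", data)


def render_text(html):
    """Return the visible text of a document as a list of non-empty lines."""
    renderer = TextRenderer()
    renderer.feed(html)
    renderer.close()
    return [line.strip() for line in renderer.lines if line.strip()]


if __name__ == "__main__":
    page = "<h1>Title</h1><p>Hello <b>world</b>!</p><script>var x = 1;</script>"
    for line in render_text(page):
        print(line)
```

Everything past this point (layout, CSS, JS, media) is where the real work starts.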
Internal_Outcome_182@reddit
No, it's not hard... it's uber-ultra-impossible-hard. You could even say new engines don't appear because we do not really know how.
HashDefTrueFalse@reddit
I don't think there's any merit to that whatsoever to be honest. We know how to build every constituent part, because we've done it. I very much doubt that in the short time since 1990 all of that knowledge has died along with the people who have, even more so without significant documentation being left behind. There's still plenty of programmers around from then. A group of people each possessing skill in some part (compiler/VM writing, rendering, network programming, etc.) could build a full-featured browser engine just fine given plenty of time and money.
The reason they don't and only a handful of full-featured ones exist is the same as for operating systems in my view (which I've also built a small one of)... it's hard, expensive, time-consuming, and good enough options already exist that it makes sense to just use them. If you're looking to build, do you build your own, or contribute to one of the existing ones? A form of network effect.
I liken this to when people say things like "we don't know how AI works" for impressive-sounding sound bites or headlines. Yes, we do. (They're disingenuously referring to the fact that a big bag of weightings appears very opaque and ignoring that we devised the machinery of its creation)...
cgoldberg@reddit
You don't think there is any merit to calling it "hard"... you just think it's not commonly done because... it's "hard"? 👌
MyNameIsSquare@reddit (OP)
was your browser for general websites or just one/some specific site?
HashDefTrueFalse@reddit
It would display the HTML document it received as input, and you could point it at any HTTP URL. It would make a GET over a TCP socket in the normal way. I implemented POST but no other methods. Some elements that I didn't implement just wouldn't render. Some exotic ones would crash the parser, but most just emitted nothing. It was Visual C++ and OpenGL IIRC. It wasn't particularly impressive. Remember that this was in the 00s too, so HTML and webpages were simpler.
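For reference, "a GET over a TCP socket" can be sketched in a few lines. This is an illustration in Python, not the commenter's Visual C++ code, and `build_get`, `split_response`, and `http_get` are made-up names. It assumes plain HTTP on port 80 and a server that closes the connection when done; it does not handle TLS, redirects, or chunked transfer encoding, all of which YouTube would require.

```python
import socket


def build_get(host, path="/"):
    """Serialize a minimal HTTP/1.1 GET request."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def split_response(raw):
    """Split raw response bytes into (status code, headers dict, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status = int(lines[0].split(" ", 2)[1])  # e.g. "HTTP/1.1 200 OK" -> 200
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers, body


def http_get(host, path="/", port=80):
    """Send the request and read until the server closes the connection."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_get(host, path))
        chunks = []
        while chunk := sock.recv(4096):
            chunks.append(chunk)
    return split_response(b"".join(chunks))


if __name__ == "__main__":
    status, headers, body = http_get("example.com")
    print(status, headers.get("content-type"), len(body), "bytes")
```

The body returned here is what would then be fed to the HTML parser.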
SamIAre@reddit
Why? There’s no benefit to this. Are you building an app? Just use whatever app framework you want and the YouTube API. Is it a website? Then just build a website. Idk what you’re trying to accomplish by reinventing the wheel but absolutely nothing you mentioned even comes close to needing that. Everything you mentioned in the “need to implement” section is just stuff that exists that you can use, with no downside and immense upside, for free in any framework you choose.
Super_Preference_733@reddit
5 years, 10 resources, and a 30 million dollar budget.